Repair bad character due to encoding problem

Developer (https://www.devze.com), 2023-01-01 11:37. Source: the web

Recently we had an encoding problem in our system:

If we had the string "æ" in our db, it became "Ã¦" on our web pages.

That problem is now solved, but we are left with a lot of "Ã¦" in our database: users didn't notice these characters in pre-filled forms and submitted them as they were.

I found that if you read the bytes C3 A6 as UTF-8 you get "æ", but if you read them as ASCII you get "Ã¦".

It's strange because if I execute

"select convert(varbinary(40),N'æ'),convert(varbinary(40),'æ')"

I don't have the same result...

Do you have any idea how I can fix my database (i.e. change all "Ã¦" to "æ")?

thx


As far as I know, the only way to fix this is to use Replace:

Update Table
Set Column = Replace(Column, N'Ã¦', N'æ')

In this case, I'm assuming that the column is now Unicode (i.e. nvarchar or nchar).


if you read it in ascii you'll get "Ã¦".

ASCII only assigns characters to the bytes 00-7F. There are, however, several "extended ASCII" encodings in which C3 A6 represents "Ã¦", including the popular Western European encodings ISO-8859-1 and windows-1252, and Turkish ISO-8859-9 and windows-1254.
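
As a quick check of the claim above, here is a minimal Python sketch (Python is my choice for illustration; the thread itself does not name a language). It decodes the two UTF-8 bytes of "æ" under each of the four code pages mentioned, and shows that strict ASCII rejects them. The helper name `decode_all` is hypothetical.

```python
# UTF-8 encodes U+00E6 (ae ligature) as the two bytes C3 A6.
utf8_bytes = "\u00e6".encode("utf-8")

# Python codec names for the four single-byte code pages named above.
CODE_PAGES = ["iso-8859-1", "cp1252", "iso-8859-9", "cp1254"]


def decode_all(data):
    """Decode the same bytes under each single-byte code page."""
    return {name: data.decode(name) for name in CODE_PAGES}


results = decode_all(utf8_bytes)
for name, text in results.items():
    # Each code page maps C3 to U+00C3 and A6 to U+00A6.
    print(name, utf8_bytes.hex(), ascii(text))

# Strict ASCII has no characters above 7F, so decoding fails outright.
try:
    utf8_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

All four code pages agree on these two byte values, so the garbled text alone does not tell you which of them produced it.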

To fix your encoding problem, simply:

  1. Encode the string to a byte array using code page 1252 (or 1254 for Turkish). This should produce the UTF-8 bytes.
  2. Decode the byte array to a string using UTF-8.
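
The two steps above can be sketched in Python as follows. This is only an illustration under assumptions: the function name `repair_mojibake` and the fallback behaviour (returning the input unchanged when the round trip fails) are mine, not part of the answer, and the answer does not say which language it had in mind.

```python
def repair_mojibake(text, wrong_codec="cp1252"):
    """Undo text that was UTF-8 but got decoded with a single-byte code page.

    Step 1: encode with the wrong code page to recover the original bytes.
    Step 2: decode those bytes as UTF-8.
    If either step fails, the text was probably not damaged this way,
    so it is returned unchanged (an assumption of this sketch).
    """
    try:
        return text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


garbled = "\u00c3\u00a6"          # the two characters shown for bytes C3 A6
fixed = repair_mojibake(garbled)  # single character U+00E6
print(ascii(garbled), "->", ascii(fixed))
```

Text that is already correct is left alone, because a lone E6 byte is not valid UTF-8 and the function falls back to the input. Python's cp1252 codec leaves a few byte values (81, 8D, 8F, 90, 9D) undefined, so some damaged strings may not round-trip; test on a copy of the data before updating the database.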