Question:
In one of my databases, there is a value in a varchar field: Brokers MÃ©xico, Intermediario de Aseguro,S.A.
I am now adding a new nvarchar column and want to carry over the old values, properly encoded.
Now two questions:
A) In C#/VB.NET, how can I change MÃ©xico back to the proper value ("México") before storing it in the Unicode field (assuming I know the proper source code page)?
B) Is there a way to figure out the code page if I don't want to do it manually? (Well, asking is free, but I suppose there is none.)
Answer:
You might want to try something like this:
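// requires: using System.Text;
string broken = "Brokers MÃ©xico, Intermediario de Aseguro,S.A."; // Get text from database
// Code page 28591 is ISO-8859-1: turn the mis-decoded characters back into the original bytes
byte[] encoded = Encoding.GetEncoding(28591).GetBytes(broken);
string corrected = Encoding.UTF8.GetString(encoded); // decode those bytes as UTF-8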
It really depends on how it's been inserted - that's assuming that something has taken UTF-8 bytes, interpreted them as an ISO-8859-1 string, and then inserted that string into the database. Basically the code performs the same conversion in reverse.
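To make that reasoning concrete, here is a minimal sketch. The names MojibakeRepair, Corrupt and Repair are made up for illustration. It reproduces the damage by decoding UTF-8 bytes as ISO-8859-1, then undoes it. It assumes ISO-8859-1 really was the wrong code page; the strict (throwing) encodings are there so that a value is left alone rather than mangled when that assumption doesn't hold.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // ISO-8859-1 that throws on characters it cannot encode instead of writing '?'
    private static readonly Encoding StrictLatin1 = Encoding.GetEncoding(
        28591, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);

    // UTF-8 that throws on invalid byte sequences instead of substituting U+FFFD
    private static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Simulates the assumed damage: UTF-8 bytes read as if they were ISO-8859-1
    public static string Corrupt(string original)
    {
        return StrictLatin1.GetString(Encoding.UTF8.GetBytes(original));
    }

    // Reverses the damage; returns the input unchanged if it does not fit the pattern
    public static string Repair(string suspect)
    {
        try
        {
            byte[] raw = StrictLatin1.GetBytes(suspect); // back to the original bytes
            return StrictUtf8.GetString(raw);            // decode them as UTF-8
        }
        catch (EncoderFallbackException)
        {
            return suspect; // contains characters outside ISO-8859-1
        }
        catch (DecoderFallbackException)
        {
            return suspect; // bytes are not valid UTF-8, so probably not damaged this way
        }
    }
}
```

With this, Repair("MÃ©xico") gives "México", and a value that is already correct typically comes back unchanged because its ISO-8859-1 bytes are not valid UTF-8. If the text was actually misread as Windows-1252, characters such as € fall outside ISO-8859-1 and this sketch just returns the value as-is.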
I'm not sure about figuring out the code page automatically - I would guess at ISO-8859-1 and UTF-8 to start with, and if that doesn't work, compare a few broken values with their correct versions to work out which encodings were involved.
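One way to do that comparison in code, as a rough sketch: take one broken value whose correct form you know, and check which candidate encoding turns the UTF-8 bytes of the correct form into the broken text. CodePageGuesser and FindMisreadEncodings are hypothetical names, and the sketch assumes the original bytes were UTF-8; the list of candidates is up to you.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CodePageGuesser
{
    // Returns every candidate that, when used to decode the UTF-8 bytes of
    // 'correct', produces exactly the 'broken' text seen in the database.
    public static List<Encoding> FindMisreadEncodings(
        string broken, string correct, IEnumerable<Encoding> candidates)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(correct);
        var matches = new List<Encoding>();
        foreach (Encoding candidate in candidates)
        {
            if (candidate.GetString(utf8Bytes) == broken)
            {
                matches.Add(candidate);
            }
        }
        return matches;
    }
}
```

For example, calling it with "MÃ©xico", "México" and the candidates Encoding.ASCII and Encoding.GetEncoding(28591) should leave only ISO-8859-1. Several candidates can match when the sample only contains characters they decode identically, so it helps to test more than one value. On .NET Core and later, code pages such as 1252 are only available after registering CodePagesEncodingProvider.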