c#: How to convert a Unicode character to its ASCII equivalent

Developer https://www.devze.com 2023-02-13 16:49 Source: Internet
I know this is a recurring question here, but none of the answers have worked for me.

I'm receiving Unicode text from a system: just an email address and a name from customers.

When I save these strings to my SQL DB, some characters show up as \u escapes.

For example, the emails end up in the DB as: name\u0040domain.com

How can I convert the Unicode string to ASCII in my C# program, so that the DB gets name@domain.com?

I'd also like to replace special characters with their ASCII equivalent, or drop them when there is none. For example, "Hernán π" should become "Hernan ".

Thanks!


IMHO converting Unicode back to ASCII for some dubious storage or technical benefit isn't a good idea in the 21st century, especially since email is being changed to support Unicode in headers and bodies.

http://en.wikipedia.org/wiki/Unicode_and_e-mail

If the reason why you want to convert Hernán to Hernan is for searching, you should look at using an Accent Insensitive (AI) collation on your database, or coerce it to do so - see this SO post.
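If changing the collation isn't an option and you really do need the folded form inside C#, here is a minimal sketch of the usual decompose-and-filter approach. The class and method names (`AsciiFolding`, `ToAscii`) are made up for illustration; the calls themselves (`String.Normalize`, `CharUnicodeInfo.GetUnicodeCategory`) are standard .NET. Note that it is lossy by design: anything without an ASCII base letter is simply dropped.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AsciiFolding   // hypothetical helper name
{
    public static string ToAscii(string input)
    {
        // FormD splits accented letters into a base letter plus
        // combining marks, e.g. "á" becomes "a" + U+0301.
        string decomposed = input.Normalize(NormalizationForm.FormD);

        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks left over from decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            // Keep plain ASCII; drop everything else (such as "π").
            if (c <= 127)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With the example from the question, `AsciiFolding.ToAscii("Hern\u00E1n \u03C0")` gives `"Hernan "` (the trailing space is kept, the π is removed).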

One thing you should double-check, however, is that your strings aren't being pre-encoded before they are stored in your database (assuming that your DB column is set to accept Unicode, i.e. NVARCHAR etc). The character '@' should be stored as '@' (U+0040, i.e. 0x0040 in UTF-16) and not as the six-character text '\u0040'.

EDIT: The "\uNNNN" escaping in a string might originate from Java or Python. You might be able to trace the email string data back up your architecture to find where this escaping is applied, and change that step to emit something easier to decode in C#, such as plain UTF-8.

How do I treat an ASCII string as unicode and unescape the escaped characters in it in python?
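If the source can't be changed, the same unescaping can be done on the C# side. This is only a sketch, assuming the input contains literal `\uNNNN` sequences with exactly four hex digits and that no other backslash escapes need handling; `UnicodeEscapes.Decode` is a name invented for the example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscapes   // hypothetical helper name
{
    // A literal backslash, 'u', then exactly four hex digits.
    private static readonly Regex EscapePattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})", RegexOptions.Compiled);

    // Replaces each \uNNNN sequence with the UTF-16 code unit it names.
    // Text that is not part of such a sequence is left untouched.
    public static string Decode(string input)
    {
        return EscapePattern.Replace(input, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}
```

For the email in the question, `UnicodeEscapes.Decode(@"name\u0040domain.com")` returns `"name@domain.com"`. `Regex.Unescape` also understands `\uNNNN`, but it interprets other backslash sequences as well, which may not be what you want for customer data.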


You can use Encoding.Convert for such operations. Read about it on MSDN.
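A minimal sketch of what that looks like, assuming the goal is a plain UTF-16 to ASCII conversion (the wrapper name `AsciiConversion.ViaEncodingConvert` is invented for the example). Be aware of two limits: characters with no ASCII mapping come out as `?` with the default fallback, and a string that already contains the literal text `\u0040` passes through unchanged, because those six characters are already ASCII.

```csharp
using System;
using System.Text;

public static class AsciiConversion   // hypothetical helper name
{
    public static string ViaEncodingConvert(string input)
    {
        // .NET strings are UTF-16; Encoding.Unicode is UTF-16 little-endian.
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(input);

        // Re-encode the bytes as ASCII. Unmappable characters are
        // replaced with '?' by the default replacement fallback.
        byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, utf16Bytes);

        return Encoding.ASCII.GetString(asciiBytes);
    }
}
```

For example, `AsciiConversion.ViaEncodingConvert("Hern\u00E1n \u03C0")` gives `"Hern?n ?"`, so on its own this does not produce the "Hernan " asked for in the question.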
