Possible Duplicate:
How do I remove diacritics (accents) from a string in .NET?
Our project generates a string (Mērā nāma nitina hai) in a web page. When we read it back with Regex.Match, the special characters come out as HTML numeric character references, for example &#257; in place of ā. We want to convert each of them to either 'a' or 'ā' so that we can use the string later in the program. Thanks.
I'm not sure my method is absolutely right, but it works for me:
[EDIT]
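// Needs "using System.Text;" plus references to System.Web and System.Windows.Forms.
string first = @"M&#275;r&#257; n&#257;ma nitina hai";
// Decode the HTML character references back into ē and ā.
first = System.Web.HttpUtility.HtmlDecode(first);
// Round-trip through Windows-1252; characters it cannot represent get a closest-match replacement (ā becomes a).
byte[] ansi = Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding(1252), Encoding.Unicode.GetBytes(first));
string output = Encoding.Unicode.GetString(Encoding.Convert(Encoding.GetEncoding(1252), Encoding.Unicode, ansi));
MessageBox.Show(output);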
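The main idea of this code is to convert the string to ANSI (code page 1252) and back to Unicode. After the round trip, the diacritics that the ANSI code page cannot represent are gone. Accented characters that do exist in code page 1252, such as é, survive unchanged.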
How about this:
var correctStr = HttpUtility.HtmlDecode(@"M&#275;r&#257; n&#257;ma nitina hai");
Explanation: &#257; is an HTML numeric character reference for the accented character with Unicode code point 257 (ā).
You need to use the String.Normalize method.
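A minimal sketch of how String.Normalize could be combined with HTML decoding is shown here. The class name DiacriticsDemo and the helper RemoveDiacritics are hypothetical names chosen for illustration, and System.Net.WebUtility.HtmlDecode is used in place of HttpUtility so the sample does not need a reference to System.Web. It assumes the input contains numeric references such as &#257;.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

static class DiacriticsDemo
{
    // Hypothetical helper: strips combining marks after canonical decomposition.
    public static string RemoveDiacritics(string text)
    {
        // FormD splits "ā" into the base letter "a" plus a combining macron (U+0304).
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Keep everything except non-spacing marks (the accents themselves).
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        // Recompose whatever is left into the usual composed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    static void Main()
    {
        // Step 1: turn the character references back into real characters (ē, ā).
        string decoded = WebUtility.HtmlDecode("M&#275;r&#257; n&#257;ma nitina hai");
        // Step 2 (only if plain ASCII letters are wanted): drop the diacritics.
        string plain = RemoveDiacritics(decoded);
        Console.WriteLine(plain); // Mera nama nitina hai
    }
}
```

If the accented form 'ā' is what you want, stop after the HtmlDecode step. Unlike the code page round trip, this approach also strips accents from characters such as é, and it leaves letters without a canonical decomposition (for example ø) untouched.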