Regex for replacing à,Á,Ä etc. -> a, Õ,ò, etc. -> o

Developer (https://www.devze.com), 2023-03-31 19:10. Source: Internet
The Western Latin character set contains characters such as À Á Â Ã Ä Å, which all share the same base ('radix') character 'a'. The same happens with e, i, o, etc. Is there a regex for replacing these variants with their 'radix' characters?

This would be used, among other things, to create an SEO-friendly URL from a piece of text:

Example: La cena è pronta => La cena e pronta


Try this:

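using System.Text;                     // NormalizationForm
using System.Text.RegularExpressions;  // Regex
string str = "La cena è pronta àèéìòùçæÀÈÉÌÒÙÇÆ";
str = str.Normalize(NormalizationForm.FormD); // Split each accented letter into base letter + combining mark(s); or use NormalizationForm.FormKD
// Remove the combining marks (Unicode category Mn, non-spacing mark).
str = Regex.Replace(str, @"\p{Mn}", "");
// Result: La cena e pronta aeeioucæAEEIOUCÆ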

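But note that Æ remains Æ (and æ remains æ): it is a letter in its own right with no Unicode decomposition, not a base letter plus a combining mark, so there is nothing for \p{Mn} to strip.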
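
If the leftovers like Æ matter for your URLs, one possible workaround is to run the same two steps and then replace the remaining letters from a small lookup table. The sketch below is only an illustration under some assumptions: the class name Slug, the methods RemoveDiacritics and ToSlug, and the contents of the Extra table are all made up for this example, and the table is far from complete. The choice of ASCII replacement (for instance Ø -> O) is also a matter of taste.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

static class Slug
{
    // Hypothetical, deliberately incomplete table of letters that have no
    // Unicode decomposition and therefore survive the \p{Mn} pass.
    static readonly Dictionary<char, string> Extra = new Dictionary<char, string>
    {
        ['Æ'] = "AE", ['æ'] = "ae", ['Œ'] = "OE", ['œ'] = "oe",
        ['Ø'] = "O",  ['ø'] = "o",  ['ß'] = "ss",
        ['Đ'] = "D",  ['đ'] = "d",  ['Ł'] = "L",  ['ł'] = "l",
    };

    public static string RemoveDiacritics(string text)
    {
        // Same two steps as in the answer: decompose, then drop combining marks.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        string stripped = Regex.Replace(decomposed, @"\p{Mn}", "");

        // Replace whatever is left over, one character at a time.
        var sb = new StringBuilder(stripped.Length);
        foreach (char c in stripped)
        {
            if (Extra.TryGetValue(c, out string ascii)) sb.Append(ascii);
            else sb.Append(c);
        }
        return sb.ToString();
    }

    public static string ToSlug(string text)
    {
        string s = RemoveDiacritics(text).ToLowerInvariant();
        // Collapse every run of characters other than a-z and 0-9 into one hyphen.
        s = Regex.Replace(s, @"[^a-z0-9]+", "-");
        return s.Trim('-');
    }
}
```

With this, Slug.RemoveDiacritics("La cena è pronta") gives "La cena e pronta" and Slug.ToSlug("La cena è pronta") gives "la-cena-e-pronta". Anything that is neither decomposable nor in the table simply ends up as a hyphen in the slug, so extend the table for the languages you actually expect.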
