Converting text containing COMBINING DIAERESIS to utf-8

Developer https://www.devze.com 2023-02-27 11:49 Source: web
We have some text containing German umlauts represented in decomposed form, e.g. 'a' + COMBINING DIAERESIS (U+0308, bytes $cc $88 in UTF-8).

Any idea how to convert such text properly to utf8?


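First, if the text is not already a unicode object, decode it from its source encoding. Second, pass it through unicodedata.normalize() with the "NFC" form, which composes a base letter and a following combining mark into a single precomposed character where one exists. Third, encode the result as UTF-8.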
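
The three steps above can be sketched roughly as follows. This is a minimal illustration, assuming Python 3 (where decoded text is a str rather than a separate unicode type) and assuming the input bytes are already valid UTF-8; the function name to_composed_utf8 and the sample word are made up for the example.

```python
import unicodedata


def to_composed_utf8(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode, compose combining marks, re-encode."""
    # Step 1: decode the bytes into a text string.
    # source_encoding is an assumption; use whatever the data really is.
    text = raw.decode(source_encoding)
    # Step 2: NFC composes 'a' + U+0308 into the single code point U+00E4.
    composed = unicodedata.normalize("NFC", text)
    # Step 3: encode the composed text as UTF-8.
    return composed.encode("utf-8")


# 'B', 'a', COMBINING DIAERESIS (0xCC 0x88), 'r'
decomposed = b"Ba\xcc\x88r"
result = to_composed_utf8(decomposed)
print(result)  # b'B\xc3\xa4r'
```

Note that the decomposed input is itself valid UTF-8; what changes is that each umlaut becomes one precomposed code point instead of a base letter plus a combining mark.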
