Transliteration from Ethiopic (and others) to ASCII (ሀ -> ha; ü -> ue)

Developer https://www.devze.com 2023-01-15 07:54 Source: web
I am not yet very good at reading Amharic (Ge'ez / Ethiopic) letters.

If I have a text in Ge'ez (Ethiopic) letters ( http://en.wikipedia.org/wiki/Ge%27ez_language ), I want to transliterate it to ASCII.

When I open http://www.addismap.com/am/ (a webpage in Amharic) in the Lynx text-mode browser, it shows me "edis map: yeedis ebeba karta". How can I access this functionality from, for example, Python, Bash or PHP? Which API does Lynx use?

It seems not to be iconv:

$ iconv -f UTF-8 -t ASCII//TRANSLIT
Input:    ሀ ለ ሐ መ ሠ ረ ሰ
Output:   ? ? ? ? ? ? ?


ICU ( http://icu-project.org/ ) has an Amharic-Latin transform, which will turn your text into "hā le ḥā me še re se". You can apply it with uconv -x 'Amharic/BGN-Latin' from the command line, or use PyICU from Python.
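From Python, the PyICU route might look roughly like the sketch below. It assumes PyICU and the ICU libraries are installed, and it reuses the transform ID from the uconv command above. The function name amharic_to_latin and the optional Latin-ASCII step (a separate ICU transform that strips the remaining diacritics, since the BGN output is Latin but not plain ASCII) are additions for illustration and not part of the original answer.

```python
# Assumption: the transform ID is copied from the uconv command in the answer.
AMHARIC_TO_LATIN = "Amharic/BGN-Latin"


def amharic_to_latin(text, ascii_only=False):
    """Transliterate Ethiopic text to Latin script with ICU (hypothetical helper)."""
    # Imported lazily so the module still loads where PyICU is missing.
    import icu  # third-party: pip install PyICU

    transform_id = AMHARIC_TO_LATIN
    if ascii_only:
        # Chain a second ICU transform that folds accented Latin letters to ASCII.
        transform_id += "; Latin-ASCII"
    transliterator = icu.Transliterator.createInstance(transform_id)
    return transliterator.transliterate(text)


if __name__ == "__main__":
    sample = "ሀ ለ ሐ መ ሠ ረ ሰ"
    print(amharic_to_latin(sample))
    print(amharic_to_latin(sample, ascii_only=True))
```

Creating the transliterator once and reusing it would be cheaper when many strings are converted; it is built inside the function here only to keep the sketch short.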


The Unicode Common Locale Data Repository defines some transliterations. Unidecode (or its Python port) has even more of them.
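A minimal sketch of the Unidecode route in Python follows, assuming the third-party Unidecode package is installed. Unidecode maps "ü" to plain "u", so the "ü -> ue" style from the question title needs an extra pass first; the GERMAN_EXPANSIONS table and both function names below are hypothetical helpers written for this example, not part of Unidecode.

```python
# Hypothetical pre-pass table: German-style expansions applied before Unidecode.
GERMAN_EXPANSIONS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def expand_german(text):
    """Replace German umlauts and ß with their two-letter ASCII spellings."""
    return text.translate(GERMAN_EXPANSIONS)


def to_ascii(text):
    """Expand German letters, then let Unidecode approximate everything else."""
    # Imported lazily so expand_german works even without the package.
    from unidecode import unidecode  # third-party: pip install Unidecode

    return unidecode(expand_german(text))


if __name__ == "__main__":
    print(expand_german("für"))          # fuer
    print(to_ascii("ሀ ለ ሐ መ ሠ ረ ሰ"))
```

Unidecode works character by character from fixed tables and does not know which language the text is in, which is why language-specific conventions such as "ue" have to be handled separately.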
