CharsetICU Java example for charset conversion

Developer https://www.devze.com 2023-02-16 21:29 Source: Internet
I need to convert a file from EBCDIC (IBM 937) to UTF-8. Any idea how I can use CharsetICU (the icu4j API) for charset conversion?


There is no need to use external libraries to do this conversion (exception handling omitted):

Reader r = new InputStreamReader(new FileInputStream(...), "IBM937");
Writer w = new OutputStreamWriter(new FileOutputStream(...), "UTF-8");

char[] buf = new char[65536];
int size = 0;

while ((size = r.read(buf)) != -1)
    w.write(buf, 0, size);

r.close();
w.close();
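
The same loop as a self-contained program, as a minimal sketch. The class name CharsetConvert and the convert helper are illustrative names, not part of the original answer. It assumes the running JRE can resolve a charset named "IBM937" (the JDK's extended charsets normally include it); if not, Charset.forName throws UnsupportedCharsetException.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetConvert {

    // Decodes bytes from `in` using `from` and re-encodes them to `out` using `to`.
    // The caller owns both streams; this method only flushes the output.
    static void convert(InputStream in, OutputStream out, Charset from, Charset to) {
        try {
            Reader r = new InputStreamReader(in, from);
            Writer w = new OutputStreamWriter(out, to);
            char[] buf = new char[8192];
            int n;
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
            w.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CharsetConvert.java <ebcdic-input> <utf8-output>");
            return;
        }
        // Assumption: the JRE resolves "IBM937"; otherwise forName throws.
        Charset ebcdic = Charset.forName("IBM937");
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            convert(in, out, ebcdic, StandardCharsets.UTF_8);
        }
    }
}
```

Passing Charset objects instead of charset names avoids the checked UnsupportedEncodingException of the String-based constructors, and try-with-resources closes both files even if the copy fails.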


I think you should be able to use CharsetICU.forNameICU("ibm-937") and then pass the resulting Charset into a reader/writer.
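
A rough sketch of that lookup follows. The class IcuCharsetLookup and its methods are hypothetical names. So that the sketch compiles without icu4j-charset on the classpath, it calls com.ibm.icu.charset.CharsetICU.forNameICU reflectively and falls back to the JDK's Charset.forName when ICU is absent or rejects the name. With the jar as a normal compile-time dependency you would call CharsetICU.forNameICU("ibm-937") directly instead.

```java
import java.nio.charset.Charset;

class IcuCharsetLookup {

    // Returns ICU's Charset for `name` if icu4j-charset is on the classpath,
    // otherwise whatever the JDK itself provides under that name.
    static Charset lookup(String name) {
        try {
            Class<?> icu = Class.forName("com.ibm.icu.charset.CharsetICU");
            return (Charset) icu.getMethod("forNameICU", String.class).invoke(null, name);
        } catch (ReflectiveOperationException e) {
            // ICU is missing, or forNameICU threw: fall back to the JDK lookup.
            return Charset.forName(name);
        }
    }

    // Decodes `bytes` with the charset resolved by lookup().
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, lookup(charsetName));
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "UTF-8";
        Charset cs = lookup(name);
        System.out.println(name + " resolved to " + cs.name()
                + " (" + cs.getClass().getName() + ")");
    }
}
```

The returned object is an ordinary java.nio.charset.Charset, so it can be passed to the InputStreamReader and OutputStreamWriter constructors that accept a Charset.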


Note that this is NOT a charset conversion; it is a "transliteration" example using the ICU library.

Version: ICU4J 53.1

Class: com.ibm.icu.text.Transliterator

Transliterator.getInstance("Latin-ASCII").transliterate("Your text");

Here "Latin-ASCII" is the transliterator ID, naming the transformation to the set of characters you need (IMPORTANT: this is NOT an encoding). You can list the available IDs with Transliterator.getAvailableIDs().

For "Latin-ASCII":

 Given "123" returns "123"
 Given "abc" returns "abc"
 Given "Š Œ ñ" returns "S OE n" 