I need to convert a file from EBCDIC (IBM 937) to UTF-8. Any idea how I can use CharsetICU (from the icu4j API) for the charset conversion?
There is no need to use external libraries to do this conversion (exception handling omitted):
// Decode the EBCDIC bytes into chars, then re-encode them as UTF-8.
Reader r = new InputStreamReader(new FileInputStream(...), "IBM937");
Writer w = new OutputStreamWriter(new FileOutputStream(...), "UTF-8");
char[] buf = new char[65536];
int size;
while ((size = r.read(buf)) != -1) {
    w.write(buf, 0, size);
}
r.close();
w.close();
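Since the snippet above leaves out exception handling, here is a minimal sketch of the same read/write loop with try-with-resources, so both streams are closed even if a read or write fails. The class name Transcoder and the method transcode are made up for illustration. The demo in main uses ISO-8859-1 as the source only because every JVM is guaranteed to support it; for the file in the question you would pass Charset.forName("IBM937"), assuming your JRE ships that charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK or ICU4J.
final class Transcoder {
    /** Copies all text from in (decoded as source) to out (encoded as target). */
    static void transcode(InputStream in, OutputStream out, Charset source, Charset target)
            throws IOException {
        // try-with-resources closes the reader and writer (and the wrapped streams)
        // whether the loop finishes normally or throws.
        try (Reader r = new InputStreamReader(in, source);
             Writer w = new OutputStreamWriter(out, target)) {
            char[] buf = new char[8192];
            int n;
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "café" in ISO-8859-1: four bytes, the last one is 0xE9.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1), out,
                  StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(out.size()); // prints 5: é takes two bytes in UTF-8
    }
}
```

For files, the same method can be called with a FileInputStream and a FileOutputStream in place of the in-memory streams.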
I think you should be able to use CharsetICU.forNameICU("ibm-937"); you can then pass the resulting Charset to a reader or writer.
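Whether you need ICU at all depends on whether your JRE already knows the charset. As a rough sketch, the lookup below tries a few candidate names with the JDK's own Charset API and reports which one resolves. The class CharsetLookup, the method firstSupported, and the list of spellings are assumptions for illustration; which names resolve varies by JRE. A Charset obtained from CharsetICU.forNameICU could be used in the same place, since the reader and writer constructors accept any java.nio.charset.Charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of the JDK or ICU4J.
final class CharsetLookup {
    /** Returns the first charset among the candidate names that this JVM supports. */
    static Charset firstSupported(String... names) {
        for (String name : names) {
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalCharsetNameException e) {
                // Syntactically invalid name; move on to the next candidate.
            }
        }
        throw new IllegalArgumentException(
                "No supported charset among: " + String.join(", ", names));
    }

    public static void main(String[] args) {
        try {
            // Candidate spellings are guesses; adjust them for your runtime.
            Charset cs = firstSupported("IBM937", "ibm-937", "x-IBM937", "Cp937");
            System.out.println("Using " + cs.name());
        } catch (IllegalArgumentException e) {
            // None of the names resolved, so a library such as ICU4J would be needed.
            System.out.println(e.getMessage());
        }
    }
}
```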
Note that this is NOT a charset conversion; it is a "transliteration" example using the ICU library.
Version: ICU4J 53.1
Class: com.ibm.icu.text.Transliterator
Transliterator.getInstance("Latin-ASCII").transliterate("Your text");
Here "Latin-ASCII" is the transliterator ID: it names the transform to apply and the set of characters you want to end up with (IMPORTANT: this is NOT an encoding). You can list the available IDs with Transliterator.getAvailableIDs().
For "Latin-ASCII":
Given "123" returns "123"
Given "abc" returns "abc"
Given "Š Œ ñ" returns "S OE n"