Our XML feed gives us encoded UTF-8 characters inside an ISO-8859-1 file. This is being fed into the database. So the text is ISO-8859-1 encoded and contains the following:
&#37329;&#34701;&#24066;&#22330;
Is there a way to convert that into a normal Java string? Similar to:
String str = fromHtmlUtf8("&#37329;&#34701;&#24066;&#22330;");
where the resulting str will contain the actual characters (Chinese in this case, but the content can be quite mixed).
Thanks.
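You can use StringEscapeUtils from Apache Commons Lang; its unescapeHtml method turns HTML entities, including numeric ones, back into the characters they represent: http://commons.apache.org/lang/api-2.6/org/apache/commons/lang/StringEscapeUtils.html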
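If adding a dependency is not an option, the numeric references can also be decoded with the JDK alone. The following is only a minimal sketch, not the Commons implementation: the class name NumericEntityDecoder and the method decode are made up for illustration, and it handles only decimal and hexadecimal numeric references, not named entities such as &amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of any library): decodes numeric character references. */
final class NumericEntityDecoder {

    // Matches decimal (&#37329;) and hexadecimal (&#x91D1;) references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    private NumericEntityDecoder() {}

    static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(input, last, m.start());
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                // appendCodePoint also handles characters outside the BMP.
                out.appendCodePoint(codePoint);
            } else {
                // Out-of-range values are left exactly as they appeared.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

With this, NumericEntityDecoder.decode("&#37329;&#34701;&#24066;&#22330;") gives back the four Chinese characters 金融市场. Since the references themselves are plain ASCII, reading the file as ISO-8859-1 first and decoding afterwards should not lose anything.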
Next time, please search before asking; see: How to convert from HTML to UTF-8 in Java
If you need a small library for this, you can use HTMLEntities:
http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=htmlentities