I got a Unicode string from an external server like this:
005400610020007400650020007400ED0020007400FA0020003F0020003A0029
and I have to decode it using Java. I know that the '\u' prefix does the magic (i.e. '\u0054' -> 'T'), but I don't know how to transform it into a regular string.
Thanks in advance.
Edit: Thanks to everybody. All the answers work, but I had to choose only one :(
Again, thanks.
It looks like hex-encoded UTF-16 (big-endian, with no byte-order mark). Here is a method to transform it:
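import java.io.UnsupportedEncodingException;

// The class name is a placeholder; the original class declaration was not included.
public class HexDecoder {
    // Converts each pair of hex digits to one byte, then decodes the bytes with the given charset.
    public static String decode(String hexCodes, String encoding) throws UnsupportedEncodingException {
        if (hexCodes.length() % 2 != 0)
            throw new IllegalArgumentException("Illegal input length");
        byte[] bytes = new byte[hexCodes.length() / 2];
        for (int i = 0; i < bytes.length; i++)
            bytes[i] = (byte) Integer.parseInt(hexCodes.substring(2 * i, 2 * i + 2), 16);
        return new String(bytes, encoding);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String hexCodes = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
        // Without a byte-order mark, Java's "UTF-16" charset decodes as big-endian.
        System.out.println(decode(hexCodes, "UTF-16"));
    }
}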
Your example returns "Ta te tí tú ? :)"
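As a variation on the method above, here is a minimal sketch that passes a Charset constant instead of a charset name, so no checked UnsupportedEncodingException is involved and the byte order is explicit. The class name HexUtf16 and the method name decodeUtf16Be are made up for illustration, and the sketch assumes the server always sends big-endian UTF-16 as in the sample.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative and not from the thread.
class HexUtf16 {
    // Treats every 4 hex digits as one big-endian UTF-16 code unit.
    static String decodeUtf16Be(String hex) {
        if (hex.length() % 4 != 0) {
            throw new IllegalArgumentException("Expected a multiple of 4 hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Two hex digits make one byte.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        // Assumption: the input is big-endian UTF-16, as in the sample string.
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String hex = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
        System.out.println(decodeUtf16Be(hex));
    }
}
```

Naming UTF_16BE explicitly avoids relying on the default byte order that the plain "UTF-16" charset falls back to when no byte-order mark is present.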
You can simply split the String into substrings of length 4 and then use Integer.parseInt(s, 16) to get the numeric value of each one. Cast that to a char and build a String out of the results. For the above example you will get:
Ta te tí tú ? :)
It can be interpreted as UTF-16 or as UCS-2 (a sequence of code points, each encoded in 2 bytes and written here in hexadecimal). The two are equivalent as long as no character falls outside the BMP. An alternative parsing method:
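// The class name is a placeholder; the original snippet showed only the two methods.
public class HexCharDecoder {
    // Parses every 4 hex digits as one 16-bit value and appends it as a char.
    public static String mydecode(String hexCode) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hexCode.length(); i += 4)
            sb.append((char) Integer.parseInt(hexCode.substring(i, i + 4), 16));
        return sb.toString();
    }

    public static void main(String[] args) {
        String hexCodes = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
        System.out.println(mydecode(hexCodes));
    }
}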
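To illustrate the BMP caveat: the sample in the question contains no characters outside the BMP, so the following is only a sketch under the assumption that the server would send such characters as UTF-16 surrogate pairs. In that case the 4-digit approach still yields a correct Java String, because a Java char is itself a UTF-16 code unit. The class and method names are made up for illustration.

```java
// Hypothetical demo class; names are illustrative and not from the thread.
class SurrogateDemo {
    // Same 4-hex-digits-per-char approach as above.
    static String decodeUnits(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 4) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // D83D DE00 is the UTF-16 surrogate pair for U+1F600.
        String s = decodeUnits("D83DDE00");
        System.out.println(s.length());                             // 2 code units
        System.out.println(s.codePointCount(0, s.length()));        // 1 code point
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // 1f600
    }
}
```

If the input were strict UCS-2, or used more than four hex digits per character, the pair would not appear in this form and the approach would need to change.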