I am more than just confused. I have an EditText, and it apparently returns ISO-8859-1 or even mixed 8859-1 + UTF-8 strings.
My understanding until now was that Android is fully UTF-8, so this shouldn't even be possible.
Examples: Entering "wüste" into the EditText and dumping the string as hex gives 57 fc 73 74 65; I expected 57 c3bc 73 74 65.
Entering "wüste テスト" gives 57 fc 73 74 65 20 30c6 30b9 30c8, which now even looks like a mix of extended 8859-1 and UTF-8.
Is this the expected and intended behaviour? Can I change it somewhere? I noticed it when sending data as JSON to a server, which bailed out because of illegal UTF-8 characters.
Regards, Oliver
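Java (and therefore Android) strings are not UTF-8, but UTF-16. The values you are printing are not encoded bytes but UTF-16 code units, which for these characters are the same as the Unicode code points (ü is U+00FC, テ is U+30C6).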
You'll need to convert your string to UTF-8 in order to send it as such (either directly, or via any JSON library you might be using). This can be done by calling getBytes("UTF-8") on your string, which returns a byte array with the string in the desired encoding.
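To make the difference concrete, here is a minimal sketch. It assumes the hex dump in the question was produced by printing each char of the string, which is a guess on my part. The class and method names (Utf8Demo, hexCodeUnits, hexUtf8) are made up for illustration, and the input is written as "Wüste" with a capital W because 0x57 is 'W'. It uses getBytes(StandardCharsets.UTF_8), which does the same as getBytes("UTF-8") but without the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not taken from the original question.
public class Utf8Demo {

    // Hex of each UTF-16 code unit (what you get by looping over the chars).
    static String hexCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    // Hex of each byte after encoding the string as UTF-8.
    static String hexUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            // Mask to 0..255 so negative bytes do not print as ffffffc3 etc.
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        String text = "W\u00fcste"; // "Wüste"
        System.out.println(hexCodeUnits(text)); // 57 fc 73 74 65
        System.out.println(hexUtf8(text));      // 57 c3 bc 73 74 65
    }
}
```

The first line of output matches what the question shows, the second is what the server expects. If you build the request body yourself, write the byte array to the output stream rather than the String, or whichever default charset is in effect will decide the encoding.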