dis.readChar() returns Chinese letters - wrongly interpreted characters!

Developer https://www.devze.com 2023-01-28 02:31 Source: Internet
I want to read a file into an ArrayList of Characters. At first I thought this might be a pretty slick way of doing it:

ArrayList<Character> char_chain = new ArrayList<Character>();

try {
    fis = new FileInputStream(file);
    bis = new BufferedInputStream(fis);
    dis = new DataInputStream(bis);

    // UTF-8 unnecessary, since the file only contains the 26 letters
    while (!EOF) {
        try {
            // each readChar() call consumes two bytes from the stream
            char_chain.add(dis.readChar());
        } catch (EOFException e) {
            EOF = true;
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}

if (debug) {
    // print every character that was read
    for (Character c : char_chain) {
        System.out.println(c);
    }
}

If I do this I get Chinese letters:

噖
䝃
塘
䕅

Could someone tell me why that is? :) I should mention that the text contains regular upper-case letters like: ABCDE and so on.


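DataInputStream.readChar() assumes that you are reading UTF-16 characters. It does not decode text: each call reads two bytes and combines them into one 16-bit char, high byte first. In a file of single-byte ASCII letters, every call therefore merges two letters into a single character, and the resulting values land in the CJK range, which is why Chinese characters show up.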
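
To make the byte pairing concrete, here is a minimal sketch. The class name ReadCharDemo, the helper combine, and the sample input "VVGC" are assumptions made for illustration. The input was picked because its byte pairs give the code points U+5656 and U+4743, which appear to be the first two characters in the question's output.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class ReadCharDemo {
    // Hypothetical helper: combines two bytes the way readChar() does,
    // treating the first byte as the high 8 bits of the char.
    static char combine(byte high, byte low) {
        return (char) (((high & 0xff) << 8) | (low & 0xff));
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data: four ASCII letters, one byte each
        // (0x56 0x56 0x47 0x43).
        byte[] data = "VVGC".getBytes(StandardCharsets.US_ASCII);

        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            char first = dis.readChar();   // consumes 'V' and 'V'
            char second = dis.readChar();  // consumes 'G' and 'C'

            // Print code points rather than glyphs, so the result does not
            // depend on the console encoding.
            System.out.printf("U+%04X U+%04X%n", (int) first, (int) second);

            // The manual combination gives the same values.
            System.out.println(first == combine(data[0], data[1]));
            System.out.println(second == combine(data[2], data[3]));
        }
    }
}
```

Four letters go in, but only two chars come out, so a list built this way would also be half the expected length.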

To read character data, use an InputStreamReader with the correct encoding ("US-ASCII" should be sufficient if the file only contains basic Latin letters).
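
A minimal sketch of that approach follows. It is one possible way to write it, not the only one. The class name ReadChars, the method readChars, and the file name "letters.txt" are placeholders that do not come from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ReadChars {
    // Decodes the stream with the given charset and collects every char.
    static List<Character> readChars(InputStream in, Charset charset) throws IOException {
        List<Character> chars = new ArrayList<>();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            int c;
            // Reader.read() returns -1 at end of stream, so no EOFException
            // handling or available() check is needed.
            while ((c = reader.read()) != -1) {
                chars.add((char) c);
            }
        }
        return chars;
    }

    public static void main(String[] args) throws IOException {
        // "letters.txt" is a placeholder; pass a real path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "letters.txt");
        try (InputStream in = Files.newInputStream(path)) {
            List<Character> charChain = readChars(in, StandardCharsets.US_ASCII);
            for (Character ch : charChain) {
                System.out.println(ch);
            }
        }
    }
}
```

With US-ASCII, each byte of the file becomes exactly one list element. If the file might contain characters outside basic Latin, passing a different Charset such as StandardCharsets.UTF_8 is the only change needed.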
