Unicode and console interpretation

Developer https://www.devze.com 2023-01-21 02:34 Source: Internet

I print to the standard output some characters from a wide UTF-8 range in a Java application. My console is configured for UTF-8 support. My problem is that sometimes, when I decide to print 10 charac

I think this is due to the console interpreting some of the characters. Are there Unicode characters that a console can interpret as, for example, "erase the previous character"? Is it possible to exclude them from the output (what are the code points of these characters)?

Using carriage return or the backspace character you can get results like you describe. This little test program for instance...

public class Test {
    public static void main(String... args) {
        System.out.println("abc\rdef\u0008g");
    }
}

...prints in my terminal (Ubuntu):

$ java Test
deg
$

\r is carriage return, and \u0008 represents the backspace character. (Carriage return sends the cursor back to the first column, and backspace sends it back one column.)
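To answer the "what are the code points" part of the question for a concrete string, one option is to scan the string and report every control character it contains. The following is only a minimal sketch: the class name ControlCharReport and the method findControls are hypothetical names chosen for illustration, and it relies on Character.isISOControl, which covers U+0000 to U+001F and U+007F to U+009F.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original answer.
final class ControlCharReport {

    // Returns one entry per ISO control character found in the string,
    // giving its code point in hex and its index.
    static List<String> findControls(String s) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                hits.add(String.format("U+%04X at index %d", (int) c, i));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same string as in the test program above.
        System.out.println(findControls("abc\rdef\u0008g"));
        // prints [U+000D at index 3, U+0008 at index 7]
    }
}
```

Running this on the text you are about to print should show whether a carriage return, backspace, or similar character is responsible for the missing output.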

To remove all of these so-called "control characters", you could do:

myString = myString.replaceAll("\\p{Cntrl}", "");

From the docs:

\p{Cntrl} A control character: [\x00-\x1F\x7F]
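Note that the range quoted above also contains line feed (\x0A) and tab (\x09), so removing everything in \p{Cntrl} also removes line breaks and tabs. If those should survive, a character-class intersection can exclude them. This is a sketch under that assumption; the class name ControlCharFilter and the method stripControlsKeepLayout are hypothetical names, not something from the original answer.

```java
// Hypothetical helper for illustration; not part of the original answer.
final class ControlCharFilter {

    // Removes ASCII control characters (the Cntrl class) except
    // line feed and tab, using a character-class intersection.
    static String stripControlsKeepLayout(String s) {
        return s.replaceAll("[\\p{Cntrl}&&[^\\n\\t]]", "");
    }

    public static void main(String[] args) {
        String raw = "abc\rdef\u0008g\tend\n";
        // The carriage return and backspace are dropped; the tab and
        // the trailing line feed are kept.
        System.out.print(stripControlsKeepLayout(raw));
    }
}
```

With the string from the test program, the carriage return and backspace are removed, so the terminal shows all of "abcdefg" instead of "deg".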

An obvious one is backspace.
