Unused characters in the ANSI character set

开发者 https://www.devze.com 2023-02-16 15:47 Source: web
I'm developing a small programming language together with an IDE.

The ANSI character set leaves a subset of code points unused. Here is the complete list: 0x7F, 0x81, 0x8D, 0x8F, 0x90, 0x9D

I'd like to use some of them for invisible code markup, so I am wondering how they are rendered in different environments. Can I assume they are always shown as whitespace, or will some editors take the liberty of replacing them with something like '?' or a grey rectangle?

Thank you, Dmitry


You seem to be talking about Windows-1252, which is just one of many "ANSI" code pages Windows can use, and it's probably not used outside of Windows. Don't tie a new product to an obsolete technology. Not supporting Unicode (be it UTF-16le or UTF-8) is unacceptable for a programming language.

While it's rather moot to answer the direct question, the answer is no, you cannot assume they will be treated as whitespace. Some may. Some may replace with a space. Some may replace with another glyph. Some may use special colours. Some may give a warning. Some may not load the file.

By the way, if you are referring to Windows-1252, only 0x81, 0x8D, 0x8F, 0x90, 0x9D aren't assigned.
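
One way to check that list is to try decoding every byte value with a Windows-1252 codec and collect the ones that fail. The sketch below uses Python's built-in `cp1252` codec, which follows the Unicode Consortium mapping table; the function name `unassigned_cp1252_bytes` is made up for illustration. Windows itself and some other decoders map these bytes to C1 control characters instead of rejecting them, so treat this as a view of the mapping table, not of how every environment behaves.

```python
def unassigned_cp1252_bytes():
    """Return the byte values that Python's cp1252 codec cannot decode."""
    unassigned = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            # The codec's mapping table has no character for this byte.
            unassigned.append(value)
    return unassigned


print([hex(b) for b in unassigned_cp1252_bytes()])
# → ['0x81', '0x8d', '0x8f', '0x90', '0x9d']
```

Note that 0x7F is not in the result: it decodes to the DEL control character, so it is assigned, just not printable.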


You shouldn't assume any specific behavior, as it will depend on the display widget and quite possibly on the font. Either preprocess the text for display or use an out-of-band markup mechanism (for example, many text field widgets let you attach attributes to runs of text).
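
As a rough illustration of the out-of-band approach, the markup can live in a separate list of spans that point into the plain source text, so no special bytes ever reach the editor or the file. This is only a minimal sketch: the `Span` class, the `tagged_runs` helper and the tag names are hypothetical and not taken from any particular widget toolkit, and it assumes every span lies within the bounds of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # index of the first character covered (inclusive)
    end: int    # index just past the last character covered (exclusive)
    tag: str    # hypothetical attribute name, e.g. "keyword"


def tagged_runs(text, spans):
    """Split text into consecutive runs, each paired with the tags covering it."""
    # Every span edge, plus the ends of the text, is a potential run boundary.
    boundaries = sorted(
        {0, len(text), *(s.start for s in spans), *(s.end for s in spans)}
    )
    runs = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        tags = sorted(s.tag for s in spans if s.start <= lo and hi <= s.end)
        runs.append((text[lo:hi], tags))
    return runs


source = "let x = 42"
markup = [Span(0, 3, "keyword"), Span(8, 10, "number")]
for run, tags in tagged_runs(source, markup):
    print(repr(run), tags)
```

The IDE would hand each run and its tags to whatever attribute mechanism its text widget offers, while the file saved to disk contains only `source`.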
