How can I set ASCII encoding for my XHTML document?

Developer https://www.devze.com 2023-03-13 18:28 Source: web
I realize UTF-8 is the standard, but I have reasons for wanting ASCII. I found a good reference which states that 50% of web sites use UTF-8 and a very small number of sites use UTF-16.

http://www.w3.org/International/questions/qa-html-encoding-declarations#httpheadwhat

But I only use the ASCII character set, so I'd like my pages to be interpreted/parsed that way. Plus, I don't want to have to guess whether a BOM is being used. With ASCII, my understanding is that there is none.

How can I set ASCII encoding for my XHTML document?


Since ASCII is a proper subset of UTF-8, you can blissfully declare UTF-8 encoding and it will make no difference.

Indeed, it is probably better than specifying ANSI_X3.4-1968 or US-ASCII (the names registered with the IANA), since it is reasonable to expect that those may be deprecated someday (or one can hope).
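As a quick illustration of the subset relationship, here is a minimal Python sketch. The sample string is made up for the example. It shows that ASCII-only text produces identical bytes under both encodings, and that Python's plain "utf-8" codec does not add a BOM on its own.

```python
import codecs

# Hypothetical sample markup containing only ASCII characters.
text = "<p>Hello, world!</p>"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For ASCII-only text the two encodings yield the same byte sequence.
print(ascii_bytes == utf8_bytes)  # True

# The "utf-8" codec writes no byte order mark ("utf-8-sig" is the one that does).
print(utf8_bytes.startswith(codecs.BOM_UTF8))  # False
```

So a file that really contains only ASCII characters is already a valid UTF-8 file, byte for byte.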


I suspect that you are not using ASCII for anything. ASCII is a 7-bit encoding developed in the 1960s. Most tool sets today do not restrict their input to 7 bits. I suspect that whatever legacy tools you are using that require a single-byte character set are actually using ISO-8859-1, or some other similar legacy character set (like CP-1252 or DOS code page 437).

If that is the case, then presenting the file as ASCII is an error, and it will lead to rendering problems in the future.
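One way to check whether a file really is 7-bit ASCII is to look for any byte with the high bit set. The sketch below is only an illustration: the helper name `find_non_ascii` and the sample strings are invented for this example.

```python
def find_non_ascii(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) pairs for every byte outside the 7-bit range."""
    return [(offset, value) for offset, value in enumerate(data) if value >= 0x80]


# Hypothetical samples: one pure ASCII, one containing an ISO-8859-1 byte.
sample_ascii = "plain text".encode("ascii")
sample_latin1 = "caf\u00e9".encode("iso-8859-1")  # e-acute is the single byte 0xE9

print(find_non_ascii(sample_ascii))   # []
print(find_non_ascii(sample_latin1))  # [(3, 233)]
```

An empty result means the data is safe to label as ASCII (or UTF-8). Any hit means the file is in some other encoding and should be declared as such.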

In any case, I highly recommend that you update your tool-chain to use Unicode.

Unicode is the basis of XML, which is the basis of XHTML. Unicode is the native string format of Windows, the .NET Framework, Linux, iOS, and every other software platform developed in the last 20 years. Unicode is the primary encoding of the web.

Any browser will have to translate your non-Unicode page into Unicode before it is displayed anyway.

Legacy character encodings anywhere in your system are a burden to maintain. They are a tax on your system that has to be paid at every interface for every modification. They are a bug-factory.

Unicode allows you to send text from anywhere to anywhere without worrying about how the text is going to be screwed up somewhere along the line.

The 20th Century (and not the 21st) is the proper place for world wars, smallpox, and legacy character encodings. Those things should be left there in the past where they belong.

Conversion to Unicode is the change that you need to make. And you can make that change. Western Union doesn't send telegrams any more. Rotary phones are rare. The future is here now, and its name is Unicode!

(Also, you should take Dr. Strangelove's advice and "learn to stop worrying and love the BOM." :D )


Yes, you can. You just need to represent any non-ASCII character with a numeric character reference (e.g., &#8364; instead of €).
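A rough sketch of how this could be automated in Python, assuming the page is generated from a script: the function name `to_ascii_xhtml` and the sample text are hypothetical. The `xmlcharrefreplace` error handler turns every non-ASCII character into a numeric character reference, and the XML declaration labels the result as US-ASCII.

```python
from xml.sax.saxutils import escape


def to_ascii_xhtml(title: str, paragraph: str) -> bytes:
    """Build a small XHTML 1.0 page and encode it as pure ASCII bytes."""
    doc = (
        '<?xml version="1.0" encoding="US-ASCII"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
        '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body><p>{escape(paragraph)}</p></body>\n"
        "</html>\n"
    )
    # Characters outside ASCII become numeric character references.
    return doc.encode("ascii", errors="xmlcharrefreplace")


page = to_ascii_xhtml("Prices", "Coffee costs 3 \u20ac")  # U+20AC is the euro sign
print(b"&#8364;" in page)  # True
print(page.isascii())      # True
```

Note that when the page is served over HTTP, a charset given in the Content-Type header takes precedence over the declaration inside the document, so the server configuration should agree with it.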
