I am having problems with encoding Chinese in an ASP site. The file formats are:
- translations.txt - UTF-8 (to store my translations)
- test.asp - UTF-8 - (to render the page)
test.asp is reading translations.txt that contains the following data:
Help|ZH|帮助
Home|ZH|首页
test.asp splits each line on the pipe delimiter. If the user has a cookie set to ZH, it displays the translation; otherwise it falls back to the key value.
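The lookup described above could be sketched roughly as follows. This is only an illustration: the function name Translate, its parameters, and the assumption that the lines have already been read into an array are not taken from the original test.asp.

```vbscript
Option Explicit

' Hypothetical helper (not the original code): returns the translation of
' key for the given language code, or key itself when no line matches.
' entries is assumed to be an array of "Key|Lang|Text" strings.
Function Translate(key, lang, entries)
    Dim entry, parts
    Translate = key                          ' fall back to the key value
    For Each entry In entries
        parts = Split(entry, "|")
        If UBound(parts) = 2 Then            ' skip malformed lines
            If parts(0) = key And parts(1) = lang Then
                Translate = parts(2)
                Exit Function
            End If
        End If
    Next
End Function
```

In the page this might be called as Translate("Home", Request.Cookies("lang"), entries), where the cookie name "lang" is likewise an assumption.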
Now, I have tried the following things, which have not worked:
- Add a meta tag: <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
- Set Response.CharSet = "UTF-8"
- Set Response.ContentType = "text/html"
- Set both Session.CodePage and Response.CodePage to 65001 (UTF-8)
- I have confirmed that the text in translations.txt is definitely UTF-8 and has no byte order mark
- The browser detects the page as Unicode (UTF-8), but the page displays gobbledegook
- The Scripting.OpenTextFile(<file>,<create>,<iomode>,<encoding>) method returns the same incorrect text regardless of the Encoding parameter.
Here is a sample of what I want to be displayed in Chinese (ZH):
- 首页
- 帮助
But the following is displayed:
- é¦–é¡µ
- å¸®åŠ©
This occurs in all tested browsers: Google Chrome, IE 7/8, and Firefox 4. The font definitely includes Chinese glyphs, and I do have East Asian language support installed.
--
I have tried pasting the original value directly into the HTML, which did work (but note this is a hard-coded value).
- 首页
- 首页
However, this is odd.
首页 --(in hex)--> E9 A6 96 E9 A1 B5 --(as chars)--> é¦–é¡µ
Any ideas what I am missing?
In order to read the UTF-8 file, you'll probably need to use the ADODB.Stream object. I don't claim to be an expert on character encoding, but this test worked for me:
test.txt (saved as UTF-8 without BOM):
首页
帮助
test.vbs
Option Explicit
' ADODB.Stream constants (not predefined in VBScript)
Const adTypeText = 2
Const adReadLine = -2
Dim stream : Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = adTypeText
stream.Charset = "UTF-8"   ' decode the file as UTF-8
stream.LoadFromFile "test.txt"
' Echo the file one line at a time
Do Until stream.EOS
    WScript.Echo stream.ReadText(adReadLine)
Loop
stream.Close
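Inside an ASP page, the same approach might look something like the sketch below. I have not run this against the asker's site; it assumes translations.txt sits in the same folder as the page, uses CRLF line endings, and follows the Key|Lang|Text layout from the question.

```vbscript
' Sketch for the script block of test.asp (assumptions noted above).
Response.CodePage = 65001        ' send the response as UTF-8
Response.CharSet = "UTF-8"

Const adTypeText = 2
Const adReadLine = -2

Dim stream, parts
Set stream = Server.CreateObject("ADODB.Stream")
stream.Open
stream.Type = adTypeText
stream.Charset = "UTF-8"         ' decode the file as UTF-8
stream.LoadFromFile Server.MapPath("translations.txt")

' Write out the translated text of every well-formed line
Do Until stream.EOS
    parts = Split(stream.ReadText(adReadLine), "|")
    If UBound(parts) = 2 Then
        Response.Write parts(2) & "<br>"
    End If
Loop
stream.Close
```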
Whatever part of the process is reading the translations.txt file does not seem to understand that the file is UTF-8; it looks like it is reading it in as some other encoding. You should specify the encoding in whatever code opens and reads that file. This is a separate setting from the encoding of your web page.
Inserting the byte order mark at the beginning of that file may also be a solution.
Scripting.OpenTextFile does not understand UTF-8 at all. It can only read the system default code page or Unicode (UTF-16). UTF-8 is also fairly inefficient for this data: each of these Chinese characters takes three bytes in UTF-8 but only two in UTF-16. I would recommend Unicode for this sort of data.
You should save the file as Unicode (in Windows parlance, i.e. UTF-16) and then open it with:
Dim fso : Set fso = CreateObject("Scripting.FileSystemObject")
Dim stream : Set stream = fso.OpenTextFile(yourFilePath, 1, False, -1)
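If the file already exists as UTF-8, a one-off conversion along these lines may save re-typing it. This is a sketch using ADODB.Stream rather than anything from the question; the output file name translations-utf16.txt is made up for illustration.

```vbscript
Option Explicit
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Read the existing file, decoding it as UTF-8
Dim src : Set src = CreateObject("ADODB.Stream")
src.Open
src.Type = adTypeText
src.Charset = "UTF-8"
src.LoadFromFile "translations.txt"

' Write the same text back out as "Unicode" (UTF-16)
Dim dst : Set dst = CreateObject("ADODB.Stream")
dst.Open
dst.Type = adTypeText
dst.Charset = "Unicode"
dst.WriteText src.ReadText       ' ReadText with no argument reads to the end
dst.SaveToFile "translations-utf16.txt", adSaveCreateOverWrite

src.Close
dst.Close
```

The resulting file can then be opened with OpenTextFile using -1 as the format argument, as above.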
Just use the script below at the top of your page
Response.CodePage=65001
Response.CharSet="UTF-8"