IWebBrowser: How to specify the encoding when loading html from a stream?

Developer https://www.devze.com 2022-12-26 01:09 Source: Internet
Using the concepts from the sample code provided by Microsoft for loading HTML content into an IWebBrowser from an IStream using the web browser's IPersistStreamInit interface:

pseudocode:

void LoadWebBrowserFromStream(IWebBrowser webBrowser, IStream stream)
{
   IPersistStreamInit persist = webBrowser.Document as IPersistStreamInit;
   persist.Load(stream);
}

How can one specify the encoding of the HTML inside the IStream? The IStream will contain a series of bytes, but the problem is: what do those bytes represent? They could, for example, contain bytes where:

  • each byte represents a character from the current Windows code-page (e.g. 1252)
  • each byte could represent a character from the ISO-8859-1 character set
  • the bytes could represent UTF-8 encoded characters
  • every 2 bytes could represent a character, using UTF-16 encoding
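
To make the ambiguity concrete, here is a minimal sketch of how a single character, é (U+00E9), ends up as different bytes under three of the encodings listed above. The helper names (EncodeLatin1, EncodeUtf8, EncodeUtf16Le) are hypothetical and exist only for this illustration. Each one handles only the limited range of code points noted in its comment, so this is not a general-purpose encoder.

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// ISO-8859-1 maps code points 0x00-0xFF directly to one byte each.
// (Illustration only: the caller must pass cp <= 0xFF.)
Bytes EncodeLatin1(char32_t cp) {
    return { static_cast<std::uint8_t>(cp) };
}

// UTF-8 for code points below U+0800, which need one or two bytes.
Bytes EncodeUtf8(char32_t cp) {
    if (cp < 0x80) {
        return { static_cast<std::uint8_t>(cp) };
    }
    return { static_cast<std::uint8_t>(0xC0 | (cp >> 6)),
             static_cast<std::uint8_t>(0x80 | (cp & 0x3F)) };
}

// UTF-16 little-endian for code points in the Basic Multilingual Plane
// (no surrogate pairs): low byte first, then high byte.
Bytes EncodeUtf16Le(char32_t cp) {
    return { static_cast<std::uint8_t>(cp & 0xFF),
             static_cast<std::uint8_t>((cp >> 8) & 0xFF) };
}
```

For U+00E9 these produce E9, C3 A9, and E9 00 respectively. A consumer that guesses the wrong encoding will therefore misread the same stream.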

In my particular case, I am providing the IWebBrowser an IStream that contains a series of double-byte characters (UTF-16), but the browser (incorrectly) believes that UTF-8 encoding is in effect. This results in garbled characters.

Workaround solution

While the question asks how to specify the encoding, in my particular case, with only UTF-16 encoding, there's a simple workaround. Prepending the 0xFEFF Byte Order Mark (BOM) to the stream indicates that the text is UTF-16 Unicode. IE then uses the proper encoding and shows the text properly.
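
As a rough sketch of that workaround, the function below builds the byte buffer that would go into the stream: the BOM first, then the UTF-16 code units in little-endian order. The name ToUtf16LeWithBom is hypothetical, and little-endian order is an assumption (it is the usual in-memory layout on Windows). Copying the resulting bytes into an actual IStream is not shown here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serialize UTF-16 code units as little-endian bytes, preceded by the
// byte order mark U+FEFF, which appears as FF FE in little-endian order.
std::vector<std::uint8_t> ToUtf16LeWithBom(const std::u16string& text) {
    std::vector<std::uint8_t> out;
    out.reserve(2 + text.size() * 2);

    // BOM: lets the consumer detect UTF-16 and its byte order.
    out.push_back(0xFF);
    out.push_back(0xFE);

    // Each 16-bit code unit: low byte first, then high byte.
    for (char16_t unit : text) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    return out;
}
```

For example, ToUtf16LeWithBom(u"<p>") yields the eight bytes FF FE 3C 00 70 00 3E 00.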

Of course that wouldn't work if the text were encoded, for example with:

  • UCS-2
  • UCS-4
  • ISO-10646-UCS-2
  • UNICODE-1-1-UTF-8
  • UNICODE-2-0-UTF-16
  • UNICODE-2-0-UTF-8
  • US-ASCII
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • WINDOWS-1250
  • WINDOWS-1251
  • WINDOWS-1252
  • WINDOWS-1253
  • WINDOWS-1254
  • WINDOWS-1255
  • WINDOWS-1256
  • WINDOWS-1257
  • WINDOWS-1258


IE's document supports IPersistMoniker loading too. IE uses URL monikers for downloading. You can replace the URL moniker created by CreateURLMonikerEx with your own moniker. A few details about the URL moniker's implementation can be found here. See if you can get IHttpNegotiate from the binding context when your BindToStorage implementation is called.
