PHP Tidy and Character Encoding

https://www.devze.com 2023-03-29 05:12 Source: web
I am making use of PHP tidy like so:

$config = array(
    'wrap'              => 0,
    'lower-literals'    => 1,
    'preserve-entities' => 1,
    'drop-empty-paras'  => 0,
);

$tidy = new tidy;
$tidy->parseString($html, $config, 'utf8');

$tidy->cleanRepair();

When I pass in HTML with English text, it comes out fine. With French text, however, it has trouble with the encoding: if I pass in something like vérifier, it appears as vÃ©rifier in the output. How can I get tidy to play nice with all languages, or at least the Latin-script ones?

In addition, I will be passing the output of tidy through to PHP's DOMDocument. Is there anything I should be careful with here?


It looks very much like the UTF-8 handling is working fine, but the result is being interpreted as Latin-1 instead of UTF-8. Set an appropriate HTTP header or meta tag instructing the browser to read the document as UTF-8.

header('Content-Type:text/html; charset=utf-8');
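
To see why this points at how the output is read rather than at tidy itself, here is a minimal sketch that reproduces the symptom without tidy. It assumes the mbstring extension is available; the variable names are only illustrative. It takes correct UTF-8 bytes and decodes them as if they were Latin-1, which is roughly what a browser does when no charset is declared.

```php
<?php
// "vérifier" as UTF-8: the é is stored as the two bytes 0xC3 0xA9.
$utf8 = "v\xC3\xA9rifier";

// Simulate a reader that wrongly assumes Latin-1 (ISO-8859-1):
// every byte is treated as one character, so 0xC3 becomes "Ã"
// and 0xA9 becomes "©".
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo $misread, "\n"; // prints: vÃ©rifier
```

The bytes themselves are never damaged. Once the page is declared as UTF-8 with the header above, the same bytes should display as vérifier.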