
loadXML unhandleable error

Developer https://www.devze.com 2023-03-04 05:10 Source: web

I'm using PEAR XML_Feed_Parser. When I give it some bad XML, I get this error:

DOMDocument::loadXML(): Input is not proper UTF-8, indicate encoding !
Bytes: 0xE8 0xCF 0xD3 0xD4 in Entity, line: 7 

The input is actually HTML in the wrong encoding, KOI8-R.

Getting an error is fine, but I can't handle it!

When I create a new XML_Feed_Parser instance with $feed = new XML_Feed_Parser($xml);

it calls __construct(), which looks like this:

$this->model = new DOMDocument;
if (! $this->model->loadXML($feed)) {
    if (extension_loaded('tidy') && $tidy) {
        /* tidy stuff */
    } else {
        throw new Exception('Invalid input: this is not valid XML');
    }
}

Here we can see that if loadXML() fails, it throws an exception.

I want to catch the error from loadXML() so that I can skip bad XML and notify the user. So I wrapped my code in try-catch like this:

try
{
    $feed = new XML_Feed_Parser($xml);
    /* ... */
}
catch(Exception $e)
{
    echo 'Feed invalid: '.$e->getMessage();
    return False;
}

But even after that I still get the error:

DOMDocument::loadXML(): Input is not proper UTF-8, indicate encoding !
Bytes: 0xE8 0xCF 0xD3 0xD4 in Entity, line: 7 

I've read about loadXML() and found this:

If an empty string is passed as the source, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.

But somehow, instead of a warning, I get an error that halts my application. I wrote my own error handler and saw that it really is a warning ($errno is 2, i.e. E_WARNING).

So I see two solutions:

  1. Turn the warnings back into plain warnings, so they are not treated as errors (Google doesn't help me here), and then handle the False returned from loadXML().

  2. Somehow catch that error.
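
For option 2, one common approach is to convert PHP warnings into ErrorException objects so that an ordinary try/catch can see them. The following is only a sketch: the handler, the $badXml sample, and the $caught variable are illustrative assumptions and are not part of XML_Feed_Parser.

```php
<?php
// Sketch: turn PHP warnings into exceptions so try/catch can handle them.
// The sample input below is made up; it reuses the bytes from the error message.
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
    // Respect the current error_reporting level.
    if (!(error_reporting() & $errno)) {
        return false; // fall through to PHP's default handling
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});

// KOI8-R bytes inside an XML document: not valid UTF-8.
$badXml = "<?xml version=\"1.0\"?><root>\xE8\xCF\xD3\xD4</root>";

$caught = null;
try {
    $doc = new DOMDocument;
    $doc->loadXML($badXml); // the warning raised here becomes an ErrorException
} catch (ErrorException $e) {
    $caught = $e->getMessage();
} finally {
    restore_error_handler(); // put the previous handler back
}

echo $caught === null ? "parsed\n" : "Feed invalid: $caught\n";
```

Because ErrorException extends Exception, an existing catch (Exception $e) block would also catch it.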

Any help?


libxml_use_internal_errors(true) solved my problem. It makes libxml collect its errors internally instead of raising PHP warnings, so loadXML() simply returns False and the failure can be handled.
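
A minimal sketch of how that can look with a plain DOMDocument. The $badXml sample and the variable names are assumptions made for illustration. With XML_Feed_Parser, the same libxml_use_internal_errors(true) call would go before new XML_Feed_Parser($xml).

```php
<?php
// Sketch: collect libxml errors instead of letting them surface as PHP warnings.
// KOI8-R bytes inside an XML document: not valid UTF-8.
$badXml = "<?xml version=\"1.0\"?><root>\xE8\xCF\xD3\xD4</root>";

// Enable internal error handling; the call returns the previous setting.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument;
$ok = $doc->loadXML($badXml); // false on failure, no warning emitted

$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError object (message, line, column, level, code).
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

libxml_clear_errors();                 // empty the error buffer
libxml_use_internal_errors($previous); // restore the earlier setting

if (!$ok) {
    echo "Feed invalid:\n  " . implode("\n  ", $messages) . "\n";
}
```

Reading libxml_get_errors() is optional, but it gives you something concrete to show the user when a feed is skipped.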


Try this one:

$this->model = new DOMDocument;
// Convert the KOI8-R input to UTF-8 before parsing.
$converted = mb_convert_encoding($feed, 'UTF-8', 'KOI8-R');
if (! $this->model->loadXML($converted)) {
    if (extension_loaded('tidy') && $tidy) {
        /* tidy stuff */
    } else {
        throw new Exception('Invalid input: this is not valid XML');
    }
}

Or you can do it without modifying XML_Feed_Parser, like this:

$xml = mb_convert_encoding($loaded_xml, 'UTF-8', 'KOI8-R');
$feed = new XML_Feed_Parser($xml);
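
If only some feeds are in KOI8-R, converting every feed unconditionally would corrupt the ones that are already UTF-8. A hedged sketch of a guard follows. toUtf8() is a hypothetical helper, and KOI8-R as the fallback encoding is an assumption that fits this question, not a general rule. It requires the mbstring extension.

```php
<?php
// Sketch: convert to UTF-8 only when the input is not already valid UTF-8.
// toUtf8() is a hypothetical helper, not part of XML_Feed_Parser.
function toUtf8(string $raw, string $fallback = 'KOI8-R'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave untouched
    }
    // Assumption: anything that is not valid UTF-8 is in the fallback encoding.
    return mb_convert_encoding($raw, 'UTF-8', $fallback);
}

$koi8 = "\xE8\xCF\xD3\xD4";      // the bytes from the error message
$utf8 = toUtf8($koi8);           // converted from KOI8-R
$same = toUtf8('plain ascii');   // returned unchanged

echo bin2hex($utf8), "\n";
```

Note that if the document has an XML declaration naming a different encoding, the declaration would also need to agree with the converted bytes.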