Trying to parse a not well-formed UTF-8 XML file with PHP

Developer https://www.devze.com 2023-03-12 21:19 Source: Internet
I am trying to parse an XML file, but there is one place where the file is not well formed. I have tried many conversions and other approaches, but nothing helps. First I tried SimpleXML, then XMLReader, but I always get the error: "parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0x0C 0x41 0x62 0x6F".

Is there a trick for manipulating the XML content first, before I pass it to SimpleXML? Or does anyone know a better XML parser that works with not well-formed XML strings?

Thanks, Nik


I have used DOMDocument with some success:

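// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
// $r holds the raw markup; the lenient HTML parser tolerates malformed input.
$doc->loadHTML($r);
foreach ($doc->getElementsByTagName('mytag') as $t) {
    // ... process each <mytag> element ...
}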

After you have loaded the document, there are some functions you can call that will try to clean it up; see DOMDocument.
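
As for manipulating the XML content before it reaches SimpleXML, here is a minimal sketch of one possible approach, not a definitive fix. The byte 0x0C in the error message is a form feed, a control character that XML 1.0 does not allow, so the sketch strips such characters and replaces invalid UTF-8 sequences before parsing. The function name sanitizeXmlString and the sample string are hypothetical, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper (not part of the original answer): cleans a raw XML
// string so that a strict parser such as SimpleXML is more likely to accept it.
function sanitizeXmlString(string $xml): string
{
    // Re-encode UTF-8 to UTF-8; mbstring swaps invalid byte sequences
    // for its substitute character (assumes the mbstring extension).
    $xml = mb_convert_encoding($xml, 'UTF-8', 'UTF-8');

    // Remove control characters that XML 1.0 forbids.
    // Tab (0x09), line feed (0x0A) and carriage return (0x0D) are kept.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $xml);
}

// Invented sample input containing a form feed (0x0C) inside an element.
$raw = "<root><item>\x0CAbout</item></root>";

libxml_use_internal_errors(true);
$xml = simplexml_load_string(sanitizeXmlString($raw));

if ($xml === false) {
    // Show what libxml complained about if parsing still fails.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), PHP_EOL;
    }
    libxml_clear_errors();
} else {
    echo $xml->item, PHP_EOL;
}
```

This only helps when the damage is limited to stray bytes. If the file is actually in another encoding such as ISO-8859-1, converting from that encoding to UTF-8 would be the more appropriate step.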
