Removing cdata in simplehtmldom

Developer https://www.devze.com 2023-04-06 21:04 Source: the web

Related topic: php

Hello, good day. I am trying to scrape an XML feed that was given to us. I am using simplehtmldom to scrape it, but some of the contents are wrapped in CDATA. How can I remove it?

<date>
<weekday>
<![CDATA[ Friday
]]> 
</weekday>
</date>

php

<?php
include('simple_html_dom.php');
include('phpQuery.php');
if (ini_get('allow_url_fopen')) {
    // Fetch and parse the feed directly when URL fopen wrappers are enabled
    $xml = file_get_html('http://www.link.com/url.xml');
} else {
    // Otherwise download the feed with cURL and parse the returned string
    $ch = curl_init('http://www.link.com/url.xml');
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $src = curl_exec($ch);
    $xml = str_get_html($src, false);
}
?>
<?php 
foreach($xml->find('weekday') as $e)
echo $e->innertext  . '<br>';
?>

I believe simplehtmldom removes the CDATA by default, but for some reason it doesn't work.

Kindly tell me if you need any info that would be helpful to solve this issue

Thank you so much for your help

You can use another XML parser that is able to convert the CDATA into a string:

$innerText = '<![CDATA[ Friday
]]>';

$innerText = (string) simplexml_load_string("<x>$innerText</x>");

Extended code-example based on OP's code

# [...]
<?php 
foreach($xml->find('weekday') as $e)
{
    $innerText = $e->innertext;
    $innerText = (string) simplexml_load_string("<x>$innerText</x>");
    echo $innerText . '<br>';
}
?>

Usage instructions: Locate the line which contains the foreach and then compare the original code with the new code (only the foreach in question has been replaced).
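
The conversion above can also be wrapped in a small function so the loop stays readable. This is only a sketch: the name cdataToText is hypothetical and not part of simple_html_dom or SimpleXML, and the trim() call plus the fallback for fragments that are not well-formed XML are assumptions about what the OP wants.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): turns a fragment that
// may contain a CDATA section into plain, trimmed text.
function cdataToText(string $fragment): string
{
    // Collect libxml errors internally so a malformed fragment emits no warnings.
    $previous = libxml_use_internal_errors(true);
    $node = simplexml_load_string("<x>$fragment</x>");
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // If the wrapped fragment is not well-formed XML, fall back to the raw text.
    if ($node === false) {
        return trim($fragment);
    }

    // Casting the element to string yields its text content, CDATA included.
    return trim((string) $node);
}

$text = cdataToText("<![CDATA[ Friday\n]]>");
echo $text, "\n";
```

Inside the foreach, the two $innerText lines would then become a single call such as cdataToText($e->innertext).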

I agree with the other answer: just allow the CDATA to be shown. I'd recommend SimpleXML:

$xml = simplexml_load_file('test.xml', 'SimpleXMLElement', LIBXML_NOCDATA);
echo '<pre>', print_r($xml, true), '</pre>';

LIBXML_NOCDATA is important - keep that in there.
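
To read a single value rather than dump the whole tree, something like the following should work. It is a minimal sketch that assumes the feed looks exactly like the sample in the question; the $feed string stands in for the real file or URL, and the trim() call is an assumption about the desired output.

```php
<?php
// Sample feed copied from the question; a real feed would come from a file or URL.
$feed = <<<XML
<date>
<weekday>
<![CDATA[ Friday
]]>
</weekday>
</date>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes while parsing.
$xml = simplexml_load_string($feed, 'SimpleXMLElement', LIBXML_NOCDATA);

if ($xml === false) {
    exit("Could not parse the feed\n");
}

// Child elements are exposed as properties; trim() drops the surrounding whitespace.
$weekday = trim((string) $xml->weekday);
echo $weekday, "\n";
```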
