I want to extract the contents of a node as a string using XPath and PHP

Posted on 开发者 (https://www.devze.com), 2023-01-16 10:34. Source: the web.
I have a function that accepts a general HTML file and a general XPath expression. I want to extract a string of the matched node containing the entire text including HTML tags. Here's a simplified example...

<?php
$inDocStg = "
    <html><body>
    <div>The best-laid<br> schemes o' <span>mice</span> an' men
        <img src='./mouse.gif'><br>
    </div>
    </body></html>
    ";

$xPathDom = new DOMDocument();
@$xPathDom->loadHTML( $inDocStg );
$xPath = new DOMXPath( $xPathDom );
$matches = $xPath->query( "//div" );
echo $matches->item(0)->nodeValue;
?>

This produces (I'm looking at the generated HTML source, not the browser output)...

The best-laid schemes o' mice an' men

(the HTML tags have been stripped out).

But what I want is...

The best-laid<br> schemes o' <span>mice</span> an' men<img src='./mouse.gif'><br>

Thanks.


How about wrapping your output in <pre> tags?
echo "<pre>" . $matches->item(0)->nodeValue . "</pre>";


Try giving these two a go!

1

echo $matches->item(0)->textContent;

2

echo $matches->item(0);

The first one returns the text content of this node and its descendants, and the second one tries to invoke the magic method __toString(). Depending on how DOMDocument is built, it could be the value that you're already getting.
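To check what those two suggestions give for an element node, here is a small sketch. It assumes PHP's bundled DOM extension and uses a shortened copy of the question's markup. As far as I can tell, textContent and nodeValue hold the same tag-stripped text for an element, and DOMElement defines no __toString() method, so neither option brings the tags back.

```php
<?php
// Shortened version of the markup from the question (an assumption made for brevity).
$dom = new DOMDocument();
$dom->loadHTML("<html><body><div>The best-laid<br> schemes o' <span>mice</span></div></body></html>");
$div = $dom->getElementsByTagName('div')->item(0);

// For an element node, both properties hold the concatenated text of all descendants.
var_dump($div->textContent === $div->nodeValue);

// DOMElement has no __toString(), so the node cannot be echoed directly.
var_dump(method_exists($div, '__toString'));
```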


This will work, but without XPath:

$xPathDom = new DOMDocument();
$xPathDom->loadHTML( $inDocStg );
echo $xPathDom->saveXML($xPathDom->getElementsByTagName('div')->item(0));

or

$xPathDom = new DOMDocument();
$xPathDom->loadHTML( $inDocStg );
// Pass the node to saveHTML() (PHP 5.3.6+); without an argument it dumps the whole document.
echo $xPathDom->saveHTML( $xPathDom->getElementsByTagName('div')->item(0) );
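
Since the question asks for a general XPath expression and only the contents of the matched node, the same serialization idea can be combined with DOMXPath. The following is a minimal sketch rather than a definitive solution: the helper name innerHtmlByXPath is made up for illustration, and it assumes that saving each child node with saveHTML() (PHP 5.3.6+) is an acceptable stand-in for "inner HTML". The parser normalizes the markup, so for example the single quotes around the src attribute may come back as double quotes.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the original post):
// returns the serialized children of the first node matching $expr, or null.
function innerHtmlByXPath(string $html, string $expr): ?string
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $matches = $xpath->query($expr);
    if ($matches === false || $matches->length === 0) {
        return null; // invalid expression or no match
    }

    // Serialize each child in turn so the matched element's own tag is left out.
    $inner = '';
    foreach ($matches->item(0)->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    return $inner;
}

$inDocStg = "
    <html><body>
    <div>The best-laid<br> schemes o' <span>mice</span> an' men
        <img src='./mouse.gif'><br>
    </div>
    </body></html>
    ";

echo innerHtmlByXPath($inDocStg, '//div');
```

To include the enclosing <div> as well, pass the matched node itself to saveHTML() instead of looping over its children.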