I have a function that accepts an arbitrary HTML file and an arbitrary XPath expression. I want to extract the matched node as a string that contains its entire content, including the HTML tags. Here's a simplified example:
<?php
$inDocStg = "
<html><body>
<div>The best-laid<br> schemes o' <span>mice</span> an' men
<img src='./mouse.gif'><br>
</div>
</body></html>
";
$xPathDom = new DOMDocument();
@$xPathDom->loadHTML( $inDocStg );
$xPath = new DOMXPath( $xPathDom );
$matches = $xPath->query( "//div" );
echo $matches->item(0)->nodeValue;
?>
This produces (I'm looking at the generated HTML source, not the browser output)...
The best-laid schemes o' mice an' men
(the HTML tags have been stripped out).
But what I want is...
The best-laid<br> schemes o' <span>mice</span> an' men<img src='./mouse.gif'><br>
Thanks.
How about wrapping your output in <pre> tags?
echo "<pre>" . $matches->item(0)->nodeValue . "</pre>";
Try giving these two a go:
1
echo $matches->item(0)->textContent;
2
echo $matches->item(0);
The first one returns the text content of this node and its descendants, so the tags are still stripped and you get the value you're already getting. The second one relies on the magic method __toString(); DOMElement does not define one, so PHP reports that the object could not be converted to string instead of printing the markup.
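A quick way to check what those two properties return is sketched below. It uses a shortened copy of the question's markup, and the variable names are only illustrative. It is a minimal check, not a fix for the original problem.

```php
<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings quietly instead of using @
$dom->loadHTML("<div>The best-laid<br> schemes o' <span>mice</span> an' men</div>");
libxml_clear_errors();

$div = $dom->getElementsByTagName('div')->item(0);

// For an element, nodeValue and textContent return the same concatenated text.
var_dump($div->textContent === $div->nodeValue);

// No markup survives in the text content.
var_dump(strpos($div->textContent, '<') === false);

// DOMElement has no __toString() method, so it cannot be echoed directly.
var_dump(method_exists($div, '__toString'));
```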
This works, but without XPath:
$xPathDom = new DOMDocument();
$xPathDom->loadHTML( $inDocStg );
echo $xPathDom->saveXML($xPathDom->getElementsByTagName('div')->item(0));
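or, to get HTML rather than XML serialization (passing a node to saveHTML() needs PHP 5.3.6 or later; without the argument, saveHTML() dumps the whole document):
$xPathDom = new DOMDocument();
$xPathDom->loadHTML( $inDocStg );
$divNode = $xPathDom->getElementsByTagName('div')->item(0);
echo $xPathDom->saveHTML( $divNode );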
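Serializing a single node is not tied to getElementsByTagName(); a node returned by DOMXPath::query() can be serialized the same way. Here is a minimal sketch, assuming PHP 7 or later (saveHTML() accepts a node argument since PHP 5.3.6). The helper names outerHtml() and innerHtml() are made up for illustration and are not part of the DOM extension. Since the question asks for the markup between the div tags, innerHtml() serializes each child node in turn.

```php
<?php
// Hypothetical helpers (names are illustrative, not part of PHP's DOM extension).

/** Serialize a node including its own opening and closing tags. */
function outerHtml(DOMNode $node): string
{
    return $node->ownerDocument->saveHTML($node);
}

/** Serialize only the children of a node, i.e. the markup between its tags. */
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$inDocStg = "<html><body>
<div>The best-laid<br> schemes o' <span>mice</span> an' men
<img src='./mouse.gif'><br>
</div>
</body></html>";

$xPathDom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of silencing with @
$xPathDom->loadHTML($inDocStg);
libxml_clear_errors();

$xPath   = new DOMXPath($xPathDom);
$matches = $xPath->query('//div');

// query() returns false on a malformed expression and an empty list on no match.
if ($matches !== false && $matches->length > 0) {
    echo innerHtml($matches->item(0));
}
```

The result is re-serialized from the parsed tree rather than copied from the input, so details such as attribute quoting and whitespace may differ from the original source.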