I have the following HTML:
<html ><body >Body text <div >div content</div></body></html>
How can I get the content of body without the nested <div>?
I need to get 'Body text', but do not have a clue how to do this.
The result of running
$domhtml = new DOMDocument();
$domhtml->loadHTML($html); // loadHTML() is an instance method; static calls fail on PHP 8
print $domhtml->getElementsByTagName('body')->item(0)->nodeValue;
is 'Body textdiv content', which is not what I want.
I prefer DOMXPath for problems like this; it's very flexible.
$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
$xpath = new DOMXPath($domhtml);
$query = "/html/body/text()"; // selects all text nodes that are direct children of body
$txtnodes = $xpath->query($query);
foreach ($txtnodes as $txt) {
    echo $txt->nodeValue;
}
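As a rough sketch of the XPath approach above, the same query can be wrapped in a small function. The name bodyOwnText is hypothetical (it is not part of the DOM extension), and trimming the result is an assumption about what the asker wants, since the text node in the sample HTML ends with a space.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text that
// sits directly inside <body>, skipping text inside nested elements.
function bodyOwnText(string $html): string
{
    $doc = new DOMDocument();
    // Suppress warnings that libxml may emit for imperfect markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $parts = [];
    // text() selects only text nodes that are direct children of body.
    foreach ($xpath->query('/html/body/text()') as $textNode) {
        $parts[] = $textNode->nodeValue;
    }
    // Assumption: surrounding whitespace is not wanted in the result.
    return trim(implode('', $parts));
}

$html = '<html ><body >Body text <div >div content</div></body></html>';
echo bodyOwnText($html), "\n"; // Body text
```

Collecting the nodes into an array first keeps the function usable when body contains several separate text nodes, for example text both before and after the div.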
$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
// Note: textContent also includes the text of descendant elements
print $domhtml->getElementsByTagName('body')->item(0)->textContent;
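Based on the comments on php.net, this should work for you:
$domhtml = new DOMDocument();
$domhtml->loadHTML($html);
// getElementsByTagName() returns a DOMNodeList, so pick the first body with item(0)
print $domhtml->getElementsByTagName('body')->item(0)->firstChild->nodeValue;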
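Note that firstChild only returns the wanted text when a text node happens to be the first child of body. If that cannot be guaranteed, one possible variation is to loop over childNodes and keep only the text nodes. This is a minimal sketch; the function name directTextOfBody is made up for illustration, and the sample HTML deliberately puts the div first.

```php
<?php
// Hypothetical helper: collects the text nodes that are direct children of
// the first <body> element, wherever they appear among its children.
function directTextOfBody(DOMDocument $doc): string
{
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // no body element found
    }
    $text = '';
    foreach ($body->childNodes as $child) {
        // XML_TEXT_NODE marks plain text; element children such as <div> are skipped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

$doc = new DOMDocument();
// Here the div comes first, so firstChild would be the div, not the text.
$doc->loadHTML('<html><body><div>div content</div>Body text</body></html>');
echo directTextOfBody($doc), "\n"; // Body text
```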