I'm using this example to fetch links from a website:
http://www.merchantos.com/makebeta/php/scraping-links-with-php/
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    var_dump($href);
    $url = $href->getAttribute('href');
    echo "<br />Link stored: $url";
}
It works well and gets all the links, but I cannot get the actual 'title' (the link text) of each link. For example, if I have:
<a href="www.google.com">Google</a>
I want to be able to fetch the 'Google' term too.
I'm a little lost and quite new to XPath.
You are looking for the "nodeValue" of the text node inside the "a" node. You can get that value with:
$title = $href->firstChild->nodeValue;
Full working example:
<?php
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$dom = new DOMDocument();
$dom->loadHTML("<html><body><a href='www.test.de'>DONE</a></body></html>");
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    // The first child of the <a> element is its text node.
    $title = $href->firstChild->nodeValue;
    echo "<br />Link stored: $url $title";
}
Prints:
Link stored: www.test.de DONE
Try this:
$link_title = $href->nodeValue;
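The two suggestions above can give different results when a link contains nested markup. Here is a minimal sketch comparing them. The sample HTML and the $results array are made up for illustration, and the null check on firstChild is an added precaution for empty links, not part of either answer:

```php
<?php
// Hypothetical sample markup; the second link wraps part of its text in <b>.
$html = "<html><body>"
      . "<a href='www.google.com'>Google</a>"
      . "<a href='www.test.de'><b>Bold</b> link</a>"
      . "</body></html>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$results = [];
foreach ($xpath->evaluate("/html/body//a") as $href) {
    $results[] = [
        'url'   => $href->getAttribute('href'),
        // Value of the first child only (a text node or a nested element).
        'first' => $href->firstChild !== null ? $href->firstChild->nodeValue : '',
        // Text of the whole <a> element, including nested elements.
        'full'  => $href->nodeValue,
    ];
}

print_r($results);
```

For the plain link both values should be 'Google'. For the second link, 'first' should be only 'Bold' while 'full' should be 'Bold link', so $href->nodeValue is probably the safer choice if your links may contain tags such as <b> or <span>.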