Xpath php fetch links_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-01-08 06:48 出处：网络

I\'m using th开发者_如何学Pythonis example to fetch links from a website : http://www.merchantos.com/makebeta/php/scraping-links-with-php/

相关专题：domxpath php

I'm using th开发者_如何学Pythonis example to fetch links from a website :

http://www.merchantos.com/makebeta/php/scraping-links-with-php/

$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    var_dump($href);
    $url = $href->getAttribute('href');
    echo "<br />Link stored: $url";
}

It works well; getting all the links; but I cannot get the actual 'title' of the link; for example if i have :

<a href="www.google.com">Google</a>

I want to be able to fetch 'Google' term too.

I'm little lost and quite new to xpath.

You are looking for the "nodeValue" of the Textnode inside the "a" node. You can get that value with

$title = $href->firstChild->nodeValue;

Full working example:

<?php
$dom = DomDocument::loadHTML("<html><body><a href='www.test.de'>DONE</a></body></html>");

$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    $title = $href->firstChild->nodeValue;
    echo "<br />Link stored: $url $title";
}

Prints: