How can I fetch all image src values into an array with file_get_contents()?

Developer https://www.devze.com 2023-01-14 14:03 Source: Internet
How can I fetch the src of every image on a page into an array with file_get_contents(), using preg_match or whatever?


You shouldn't use regex to parse HTML. You should use classes like DOMDocument to do so. DOMDocument has the getElementsByTagName method, which can be used to retrieve all the img tags from the document you want to parse.

Here's an example that will echo the src of every image in the document:

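<?php
    // Keep libxml from printing warnings about malformed real-world HTML.
    libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $document->loadHTML(file_get_contents('yourfilehere.html'));
    $lst = $document->getElementsByTagName('img');

    for ($i = 0; $i < $lst->length; $i++) {
        $image = $lst->item($i);
        // getAttribute() returns '' when an <img> has no src attribute.
        echo $image->getAttribute('src'), '<br />';
    }
?>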
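
Since the question asks for an array rather than echoed output, the same DOMDocument approach can be wrapped in a small function that returns the values. This is only a sketch: the function name `extractImageSources` and the sample markup are made up for illustration, and it takes an HTML string so that it runs without any file or network access.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function extractImageSources(string $html): array
{
    $sources = [];
    if (trim($html) === '') {
        return $sources; // loadHTML() does not accept an empty string
    }

    // Collect libxml warnings about malformed markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // DOMNodeList is traversable, so foreach works directly.
    foreach ($document->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src'); // '' when the attribute is absent
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Sample markup, invented for this example.
$html = '<p><img src="a.png"><img alt="no source"><img src="/img/b.jpg"></p>';
print_r(extractImageSources($html));
```

To use it on a real page, pass in the result of file_get_contents() and check that the download did not return false first.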


It's more reliable and simpler to use phpQuery or SimpleHTMLparser (more elaborate). But for basic extraction purposes, where you are just searching for src= attributes, that is overkill and a regular expression is in fact sufficient:

preg_match_all('/<img[^>]+src\s*=[\'\"\s]?([^<>\'\"]+)/ims', file_get_contents($url), $uu);

The captured values end up in $uu[1]. Note that it will yield relative path names, mostly not URLs, so the results need postprocessing, whereas phpQuery IIRC has a shortcut for normalizing them.
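
The postprocessing step could look roughly like the sketch below. The helper name `resolveImageUrl`, the sample markup and the page URL are all invented for illustration. It only covers the common cases (already-absolute, protocol-relative, root-relative and path-relative values) and does not collapse `./` or `../` segments, so treat it as a starting point rather than a full URL resolver.

```php
<?php
// Hypothetical helper; handles only the cases named in the comments below.
function resolveImageUrl(string $src, string $pageUrl): string
{
    $base = parse_url($pageUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return $src; // cannot resolve without an absolute page URL
    }
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $src)) {
        return $src; // already has a scheme (http:, https:, data:, ...)
    }
    if (strpos($src, '//') === 0) {
        return $base['scheme'] . ':' . $src; // protocol-relative
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');
    if (strpos($src, '/') === 0) {
        return $origin . $src; // root-relative
    }
    // Path-relative: append to the directory part of the page path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

// Sample input, invented for this example.
$pageUrl = 'https://example.com/blog/post.html';
$html = '<img src="logo.png"> <img class="x" src=\'/img/a.jpg\'> '
      . '<img src="https://cdn.example.org/b.gif">';

// A pattern of the same shape as the one above; group 1 holds the src values.
preg_match_all('/<img[^>]+src\s*=[\'\"\s]?([^<>\'\"]+)/ims', $html, $uu);

$absolute = [];
foreach ($uu[1] as $src) {
    $absolute[] = resolveImageUrl(trim($src), $pageUrl);
}
print_r($absolute);
```

The scheme check runs first so that values such as data: URIs are passed through untouched instead of being glued onto the page's host.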
