
Looking for a way to scrape URLs from a page and output them to a text file

Developer https://www.devze.com 2023-01-20 17:02 Source: web
I am looking for a way to scrape URLs from a web page and output them to a text file. E.g. if a page contains multiple links like http://example.com/article, I want to grab all of these URLs and output them to a text file.


Have a look at WWW::Mechanize.

Example code:

use strict;
use warnings;
use 5.010;          # enables say

use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get('http://example.com/example');

# find_all_links() returns WWW::Mechanize::Link objects;
# url_abs() resolves each one against the page's base URL.
foreach my $link ($mech->find_all_links()) {
    say $link->url_abs();
}
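Since the question asks for the URLs in a text file rather than on the screen, the same loop can print to a filehandle instead. A minimal variation (the filename links.txt is just an example):

```perl
use strict;
use warnings;

use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get('http://example.com/example');

# Write every absolute link to a text file, one per line.
open my $fh, '>', 'links.txt' or die "Cannot open links.txt: $!";
print {$fh} $_->url_abs(), "\n" for $mech->find_all_links();
close $fh;
```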


Use HTML::SimpleLinkExtor:

use strict;
use warnings;

use HTML::SimpleLinkExtor;

my $extor = HTML::SimpleLinkExtor->new();
$extor->parse_url('http://example.com/article');

# absolute_links() returns only the links that are already absolute URLs.
my @links = $extor->absolute_links();
print "$_\n" for @links;
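Either module hands back a plain list of URLs, so keeping only the article links, and dropping repeats when the page contains the same URL more than once, is ordinary list filtering. A small core-Perl sketch, using a made-up list in place of real scraped links:

```perl
use strict;
use warnings;

# Hypothetical list standing in for links returned by a scraper.
my @links = (
    'http://example.com/article',
    'http://example.com/about',
    'http://example.com/article',
);

# Keep each URL once, and only those under /article.
my %seen;
my @articles = grep { m{^http://example\.com/article} && !$seen{$_}++ } @links;

print "$_\n" for @articles;   # prints http://example.com/article once
```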
