This is my last question on this, I hope. I am using $mech->follow_link to try to download a file. For some reason, though, the file saved is just the page I first pull up, not the link I want to follow. Is this the correct way to download the file from the link? I do not want to use wget.
#!/usr/bin/perl -w
use strict;
use LWP;
use WWW::Mechanize;
my $now_string = localtime;
my $mech = WWW::Mechanize->new();
my $filename = join(' ', split(/\W++/, $now_string, -1));
$mech->credentials( '***********' , '************'); # if you need to supply server and realm, pass them to credentials as described in the LWP documentation
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/') or die "Error: failed to load the web page";
$mech->follow_link( url_regex => qr/MESH/i ) or die "Error: failed to download content";
$mech->save_content("$filename.kmz");
Steps to try
- First print the contents from your get, to make sure you're reaching a valid HTML page (see the debugging sketch after this list)
- Make sure the link you're following really is the third link called "MESH" (is the match case-sensitive?)
- Print the contents from your second get
- Print the filename to make sure it's well-formed
- Check that the file was created successfully
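A minimal debugging sketch along those lines; the URL and MESH regex are taken from the question, the credentials call is omitted for brevity, and the prints are only there for inspection:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new(autocheck => 1);
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');

# 1. Confirm the first get reached a real HTML page
print "After get: ", $mech->uri, " (status ", $mech->status, ")\n";

# 2. List every link the MESH regex would match, to see which one
#    follow_link will actually pick (it takes the first match)
for my $link ( $mech->find_all_links( url_regex => qr/MESH/i ) ) {
    print "Candidate link: ", $link->url_abs, "\n";
}

# 3. Follow the link and confirm where you actually landed
$mech->follow_link( url_regex => qr/MESH/i );
print "After follow_link: ", $mech->uri, " (status ", $mech->status, ")\n";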
Additional
- You don't need the or die (or an unless check) in either case: the call is either going to work, or it's going to die on its own once autocheck is on
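That is because autocheck => 1 makes WWW::Mechanize throw on any failed request by itself; a minimal illustration (a fragment, where $url stands for whatever page you're fetching):
my $mech = WWW::Mechanize->new(autocheck => 1);
$mech->get($url);    # dies with a useful message if the fetch fails,
                     # so "or die ..." / "unless $mech->success" adds nothing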
Example
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

sub main {
    my $url  = qq(http://www.kmzlinks.com);
    my $dest = qq($ENV{HOME}/Desktop/destfile.kmz);

    my $mech = WWW::Mechanize->new(autocheck => 1);

    # if needed, pass your credentials before this call
    $mech->get($url);
    die "Couldn't fetch page" unless $mech->success;

    # find all the links whose URLs point to .kmz files
    my @links = $mech->find_all_links( url_regex => qr/(?:\.|%2E)kmz$/i );

    foreach my $link (@links) {    # (loop example)
        # use the absolute URL of the link to download the file to the destination
        $mech->get( $link->url_abs, ':content_file' => $dest );
        last;                      # only need one (for testing)
    }
}

main();
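Adapting that pattern to the page from the question might look like the sketch below; the URL, masked credentials, MESH regex, and timestamp filename are all taken from the question, and it swaps follow_link plus save_content for find_link plus a :content_file download (untested against that server):
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

my $now_string = localtime;
my $filename   = join(' ', split(/\W++/, $now_string, -1));

my $mech = WWW::Mechanize->new(autocheck => 1);
$mech->credentials( '***********' , '************');
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');

# take the first link matching MESH and stream it straight to disk
my $link = $mech->find_link( url_regex => qr/MESH/i )
    or die "No link matching MESH on the page";
$mech->get( $link->url_abs, ':content_file' => "$filename.kmz" );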
Are you sure you want the 3rd link called 'MESH'?
Change if to unless.
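If it really is the third link whose text is 'MESH' that you want, follow_link can select it directly; text, text_regex, and n are documented WWW::Mechanize parameters (text matching is exact and case-sensitive):
# follow the 3rd link whose text is exactly 'MESH'
$mech->follow_link( text => 'MESH', n => 3 );

# or match the link text case-insensitively
$mech->follow_link( text_regex => qr/MESH/i, n => 3 );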