How do I get the text-form verification code when doing auto site access in Perl?

Developer https://www.devze.com 2022-12-21 00:45 Source: Internet
I'm playing around with Win32::IE::Mechanize to try to access some authentication-required sites automatically. So far I've had moderate success; for example, I can automatically log in to my Yahoo mailbox. But I find that many sites use some kind of image verification mechanism, probably what is called a CAPTCHA, and I can't do anything about those. However, one of the sites I'm trying to access automatically uses a plain-text verification code. It is composed of four digits, which are selectable and copyable. But they don't appear in the page source that can be fetched using

$mech->content;

I searched all the files in Temporary Internet Files for the text that appears on the web page but not in the source, and still couldn't find it.

Any idea what's going on? I was suspecting that the verification code was somehow hidden in some cookie file but I can't seem to find it :(

The following is the code that fills in all the required fields except for the verification code:

use strict;
use warnings;
use Win32::IE::Mechanize;

my $url = "http://www.zjsmap.com/smap/smap_login.jsp";
my $eccode = "myeccode";
my $username = "myaccountname";
my $password = "mypassword";
my $verify = "I can't figure out how to let the script get the code yet";

my $mech = Win32::IE::Mechanize->new(visible=>1);
$mech->get($url);
sleep(1); #avoids undefined value error
$mech->form_name("BaseForm");
$mech->field(ECCODE => $eccode);
$mech->field(MEMBERACCOUNT => $username);
$mech->field(PASSWORD => $password);
$mech->field(verify => $verify);
$mech->click();

As always, any suggestions/comments would be greatly appreciated :)

UPDATE

I've figured out a not-so-smart way to solve this problem. Please comment on my own answer posted below. Thanks, as always :)


That is exactly why they are there: to stop programs like yours from doing automated stuff ;-)

A CAPTCHA or Captcha is a type of challenge-response test used in computing to ensure that the response is not generated by a computer.


This appears to be an irrelevant number. The page uses it in three places: generating it, displaying it on the form next to its input field, and checking that the input value equals the random number chosen. That is, it is a client-only check. Still, if you disable JavaScript, it looks like (I'm guessing) important cookies don't get set. If you can execute JavaScript in the context of the page (you should be able to with a get method call and a javascript: URI), you could change the value of random_number to, for example, 42 and fill that in on the form.


The code is inserted by JavaScript – disable JS, reload the page and see it disappear. You have to hunt through the JS code to get an idea where it comes from and how to replicate it.


Thanks to james2vegas, zoul and Shoban.

I've finally figured out, on my own, a not-so-smart but at least workable way to solve the problem I described, and I'd like to share it here. I think the approach suggested by @james2vegas is probably much better, but anyway, I'm learning along the way.

My approach is this:

Although the verification code is not in the source file, it is still selectable and copyable, so I can let my script copy everything on the login page and then extract the verification code.

To do this, I use the SendKeys function in the Win32::GuiTest module to send "Select All" and "Copy" to the login page.

Then I use Win32::Clipboard to get the clipboard content and a regular expression to extract the code. Something like this:

$verify = Win32::Clipboard::GetText();
$verify =~ s/.* (\d{4}).*/$1/msg;
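
The extraction step can also be written as a capture instead of a substitution, which leaves the value undefined when no code is found rather than passing the whole clipboard text along. The following is only a sketch: extract_code is a hypothetical helper name, and the sample string is made up to stand in for the real clipboard contents. It picks the first standalone run of exactly four digits, whereas the substitution above keeps the last four-digit group that follows a space.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first standalone four-digit number
# in $text, or undef if there is none.
sub extract_code {
    my ($text) = @_;
    return unless defined $text;
    # The lookarounds reject four digits that are part of a longer number.
    my ($code) = $text =~ /(?<!\d)(\d{4})(?!\d)/;
    return $code;
}

# Made-up sample text standing in for the copied login page.
my $clipboard = "ECCODE:\nAccount:\nPassword:\nVerification code: 4821\n";
my $verify = extract_code($clipboard);
print defined $verify ? "code: $verify\n" : "no code found\n";
```

If the page happens to show other four-digit numbers (a year, for instance), the pattern would need to be anchored to the label next to the code.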

A few thoughts:

The random number is generated by something like this, written in Perl:

my $random_number = int(rand(8999)) + 1000; # in the page: var random_number = rand(1000,10000);

The page then checks whether $verify == $random_number. I don't know how to capture the value of $random_number, which is valid for one session only. I think it is stored somewhere in memory. If I could capture that value directly, I wouldn't have to go to the trouble of using these extra modules.
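
A side note on the arithmetic: Perl's rand(N) returns a value greater than or equal to 0 and less than N, so int(rand(8999)) + 1000 yields 1000 through 9998 and never 9999. The sketch below uses a hypothetical helper, random_int_between, to cover a full inclusive range. The page's real generator is not shown here, so the 1000 to 9999 range is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: uniform random integer in [$min, $max], both inclusive.
sub random_int_between {
    my ($min, $max) = @_;
    return $min + int(rand($max - $min + 1));
}

# Assumed range: any four-digit number.
my $random_number = random_int_between(1000, 9999);

# Stand-in for the value typed into the form; the page's check is
# described above as a plain equality test.
my $verify = $random_number;
print "match\n" if $verify == $random_number;
```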
