Get Mechanize to handle cookies from an arbitrary POST (to log into a website programmatically)

Developer https://www.devze.com 2022-12-25 22:24 Source: the web

I want to log into https://www.t-mobile.com/ programmatically. My first idea was to use Mechanize to submit the login form:

[Screenshot: http://dl.dropbox.com/u/2792776/screenshots/2010-04-08_1440.png]

However, it turns out that this isn't even a real form. Instead, when you click "Log in", some JavaScript grabs the values of the fields, creates a new form dynamically, and submits it.

"Log in" button HTML:

<button onclick="handleLogin(); return false;" class="btnBlue" id="myTMobile-login"><span>Log in</span></button>

The handleLogin() function:

function handleLogin() {
    if (ValidateMsisdnPassword()) { // client-side form validation logic
        var a = document.createElement("FORM");
        a.name = "form1";
        a.method = "POST";
        a.action = mytmoUrl; // defined elsewhere as https://my.t-mobile.com/Login/LoginController.aspx
        var c = document.createElement("INPUT");
        c.type = "HIDDEN";
        c.value = document.getElementById("myTMobile-phone").value; // the value of the phone number input field
        c.name = "txtMSISDN";
        a.appendChild(c);
        var b = document.createElement("INPUT");
        b.type = "HIDDEN";
        b.value = document.getElementById("myTMobile-password").value; // the value of the password input field
        b.name = "txtPassword";
        a.appendChild(b);
        document.body.appendChild(a);
        a.submit();
        return true
    } else {
        return false
    }
}

I could simulate this form submission by POSTing the form data to https://my.t-mobile.com/Login/LoginController.aspx with Net::HTTP#post_form, but I don't know how to get the resultant cookie into Mechanize so I can continue to scrape the UI available when I'm logged in.

Any ideas?
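For reference, the Net::HTTP route described above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (post_login, set_cookie_headers, import_cookies) are made up for this example, and import_cookies assumes Mechanize 2.7 or later, where agent.cookie_jar is an HTTP::CookieJar that responds to parse. Net::HTTP does not follow redirects, so if the login responds with one, cookies set on later hops would be missed.

```ruby
require 'net/http'
require 'uri'

# Login endpoint taken from the handleLogin() source above.
LOGIN_URI = URI('https://my.t-mobile.com/Login/LoginController.aspx')

# POSTs the same two fields that handleLogin() puts into its dynamic form.
def post_login(phone, password)
  Net::HTTP.post_form(LOGIN_URI, { 'txtMSISDN' => phone, 'txtPassword' => password })
end

# Returns the raw Set-Cookie header values of a Net::HTTP response,
# or an empty array when the server set no cookies.
def set_cookie_headers(response)
  response.get_fields('Set-Cookie') || []
end

# Copies those cookies into a Mechanize agent's cookie jar.
# Assumption: Mechanize 2.7+, whose cookie_jar supports parse(header, origin).
def import_cookies(agent, response, origin = LOGIN_URI)
  set_cookie_headers(response).each do |header|
    agent.cookie_jar.parse(header, origin)
  end
  agent
end

# Usage (needs the mechanize gem and real credentials):
#   require 'mechanize'
#   agent = Mechanize.new
#   import_cookies(agent, post_login('5551234567', 'secret'))
#   page = agent.get('https://my.t-mobile.com/')
```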


You can use something like this to log in and save the cookies so you won't have to do it again. You will need to come up with your own logic to POST the credentials directly, but this is how I use Mechanize's built-in cookie_jar to save cookies.

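require 'mechanize'

agent = Mechanize.new

if File.exist?('cookies.yml')
  # Reuse the session cookies saved by an earlier run
  agent.cookie_jar.load('cookies.yml')
else
  page = agent.get('http://site.com')

  # Fill in the login form (field names depend on the site)
  form = page.forms.last
  form.email = 'email'
  form.password = 'password'

  page = agent.submit(form)

  # Persist the cookies so the next run can skip the login
  agent.cookie_jar.save_as('cookies.yml')
end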


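I would avoid Net::HTTP and use the post method that Mechanize itself provides:

post(url, query={}, headers={})

Because the request goes through the agent, any cookies the server sets end up in the agent's cookie jar, and later requests from the same agent send them back.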
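Applied to the login in the question, that call might look like the sketch below. The URL and field names come from the handleLogin() source; login_params and log_in are hypothetical helper names, and whether the site accepts a bare POST like this (without extra hidden fields or headers) is an assumption that would need checking against the real traffic.

```ruby
# Login endpoint and field names taken from the handleLogin() source.
LOGIN_URL = 'https://my.t-mobile.com/Login/LoginController.aspx'

# Builds the two fields that handleLogin() copies into its dynamic form.
def login_params(phone, password)
  { 'txtMSISDN' => phone, 'txtPassword' => password }
end

# POSTs the credentials through the given Mechanize agent and returns the
# resulting page. The agent stores any cookies from the response itself.
def log_in(agent, phone, password)
  agent.post(LOGIN_URL, login_params(phone, password))
end

# Usage (needs the mechanize gem and real credentials):
#   require 'mechanize'
#   agent = Mechanize.new
#   log_in(agent, '5551234567', 'secret')
#   account_page = agent.get('https://my.t-mobile.com/')  # sent with the session cookies
```

Passing the agent in as an argument keeps the helper independent of how the agent is configured (user agent string, proxy, a previously loaded cookie jar).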


I often use the Firefox HttpFox extension to figure out exactly what is going on in this kind of problem.
