How to programmatically log on to a URL, keep the session, and browse to different pages

Developer https://www.devze.com 2023-01-27 09:49 Source: Internet
I am working on a small Java project that needs to connect to a website programmatically with a username and password and then, after logging in, browse to different links on the site to download some data. First, I need to log in to the website with the username/password. Second, while keeping the session open, I need to go to other links to download data.

How do I do this in Java?

Any help will be highly appreciated!


Check out Apache HttpClient; it can do all this for you.

Edit: Apache HttpClient includes authentication and cookie handling features, which will save you a lot of work compared with doing this yourself.


If you also want to extract some data from the pages, HtmlUnit can help you a lot: it can manage the authentication and also help you with data extraction.


  1. Use your browser to investigate how the web page submits the username/password data: HTML form POST, Ajax, etc. A plugin like Firebug lets you see the network traffic.

  2. You can use URLConnection to create HTTP requests. You will need to simulate a username/password login and remember the cookie for use in subsequent HTTP requests to simulate a session. Here are some examples: send HTTP POST request, get a cookie, send a cookie.
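
The two steps above can be sketched roughly as follows, using only the JDK (Java 11 or later). This is a minimal sketch, not a definitive implementation: the class name `SessionSketch`, the helper names, and the form field names `username` and `password` are all assumptions for illustration. The real field names and login URL have to come from what you see in the browser's network traffic in step 1. Installing a `java.net.CookieManager` as the default `CookieHandler` lets `HttpURLConnection` store the session cookie from the login response and send it back on later requests, so you do not have to copy cookie headers by hand.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SessionSketch {

    // Install a JVM-wide cookie store. Call once before the first request.
    static void enableCookies() {
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
    }

    // Build an application/x-www-form-urlencoded body from name/value pairs.
    static String formBody(String... pairs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(pairs[i], StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(pairs[i + 1], StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // POST the login form and return the HTTP status code.
    static int login(String loginUrl, String body) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(loginUrl).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    // GET a page; cookies stored by the CookieManager are sent along.
    static String fetch(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 4) {
            System.out.println("usage: SessionSketch <loginUrl> <user> <pass> <pageUrl>");
            return;
        }
        enableCookies();
        // "username" and "password" are assumed field names; check the real form.
        String body = formBody("username", args[1], "password", args[2]);
        System.out.println("login status: " + login(args[0], body));
        System.out.println(fetch(args[3]));
    }
}
```

If the site logs in through Ajax or adds hidden form fields (a CSRF token, for example), the request body will need to match whatever the browser actually sends, and a plain form POST like this one may not be enough.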
