I am not sure which module I am supposed to use for this. I have >100 files I need to submit to the following webpage and to retrieve the results.
http://bip.weizmann.ac.il/oca-bin/lpccsu
It would be beneficial if I could automate the process somehow, sending each file to the
<input type="file" name="filename" size="30">
tag, and then receiving the returned HTML so that it can be processed with regular expressions.
Thanks
Edit: to see an example output, set the radio button to CSU and enter 1eo8 in the 'PDB entry' textbox.
@Anake Here are three Python packages that provide a solution for retrieving and parsing:
From their websites:
Beautiful Soup: "Beautiful Soup parses anything you give it, and does the tree traversal stuff for you. You can tell it 'Find all the links', or 'Find all the links of class externalLink', or 'Find all the links whose urls match foo.com', or 'Find the table heading that's got bold text, then give me that text.'" [1]
mechanize: "Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize." [2]
Scrapy: "Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing." [3]
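Any of the three packages above will handle the parsing step; as a minimal stdlib-only sketch of that step (no third-party install needed), Python's built-in html.parser can already do the "find all the links" example from the Beautiful Soup description. The sample HTML here is illustrative, not the site's actual output:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a page --
    a stdlib stand-in for Beautiful Soup's "find all the links"."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Illustrative HTML, standing in for the page the server returns.
sample = '<html><body><a href="result1.html">r1</a><a href="result2.html">r2</a></body></html>'
parser = LinkExtractor()
parser.feed(sample)
print(parser.links)  # ['result1.html', 'result2.html']
```

For anything beyond link-grabbing (malformed markup, nested tables), Beautiful Soup is far more forgiving than html.parser.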
There are a few ways to do this:
1) Perl and LWP
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $response = $ua->post(
    'http://bip.weizmann.ac.il/oca-bin/lpccsu?9955',
    {
        param1 => 'value1',
        param2 => 'value2',
    }
);
my $content = $response->content;
# your regular expression code here
2) AutoHotkey, which has regular expressions and a user-contributed library that handles POST requests; see http://www.autohotkey.com/forum/topic33506.html
3) Write a batch file that uses wget --post-data and --post-file, pipe the output to a series of files, and read them with your favorite scripting language. Reference: http://www.gnu.org/software/wget/manual/html_node/HTTP-Options.html
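The wget approach also translates to Python's standard library. One caveat: since the form uses <input type="file" name="filename">, the browser sends a multipart/form-data body, whereas wget --post-file sends the raw file, which many CGI forms won't accept. Below is a hedged sketch that builds the multipart body by hand; only the URL and the field name "filename" come from the question, and the .pdb extension in the loop is an assumption about your input files:

```python
import os
import urllib.request
import uuid

URL = "http://bip.weizmann.ac.il/oca-bin/lpccsu"  # from the question

def build_multipart(field_name, filename, data):
    """Frame one file as a multipart/form-data body -- what a browser
    sends for <input type="file">. Returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{n}"; filename="{f}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).format(b=boundary, n=field_name, f=filename).encode("ascii")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("ascii")
    return head + data + tail, "multipart/form-data; boundary=" + boundary

def upload(path):
    """POST one file and return the result page's HTML (needs network)."""
    with open(path, "rb") as f:
        body, ctype = build_multipart("filename", os.path.basename(path), f.read())
    req = urllib.request.Request(URL, data=body, headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("latin-1")

if __name__ == "__main__":
    # Loop over your >100 input files, saving each result page for
    # later regex processing. The .pdb extension is an assumption.
    for path in sorted(os.listdir(".")):
        if path.endswith(".pdb"):
            with open(path + ".result.html", "w") as out:
                out.write(upload(path))
```

If you would rather not hand-roll the multipart framing, the third-party requests library (files= parameter) or mechanize's form handling will do it for you.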
Hope that helps