
How to automate the process of doing data-entry

Developer https://www.devze.com 2023-02-27 04:32 Source: web

I have a situation where I need to visit 100-odd websites to collect contact information and then enter it into my own site. What I want to know is whether it's possible to write a program or a crawler, if I'm putting it correctly, to get all this information. I'm guessing the information will be available as unstructured HTML, and I'll then have to parse it to make it structured. Has anyone had similar experience doing this? I would also like opinions on which language to use.


You're looking for a Web Scraper. A few Google searches should turn up various free and commercial products that would solve your problem. You probably don't need to write one yourself if the data you're collecting is fairly simple and well structured.


Try Ruby (the Mechanize library):

http://mechanize.rubyforge.org/mechanize/GUIDE_rdoc.html

For example (assuming `agent = Mechanize.new` after `require 'mechanize'`):

agent.get('http://someurl.com/').search(".//p[@class='posted']")
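
Once a page has been fetched, the "parsing to make it structured" step from the question might look roughly like the sketch below. It uses only the Ruby standard library and works on a raw HTML string, so it is independent of Mechanize. The method name `extract_contacts`, the two regular expressions, and the sample markup are all assumptions made for illustration; real sites will almost certainly need per-site tuning.

```ruby
# Hypothetical patterns for illustration; they are deliberately loose
# and will need adjusting for the sites actually being scraped.
EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/
PHONE_RE = /\+?\d[\d\s().-]{7,}\d/

# Crudely strip tags from one page's HTML, then collect anything
# that looks like an e-mail address or a phone number.
def extract_contacts(html)
  text = html.gsub(/<[^>]*>/, ' ')
  {
    emails: text.scan(EMAIL_RE).uniq,
    phones: text.scan(PHONE_RE).map(&:strip).uniq
  }
end

# Made-up sample markup, standing in for a fetched page body.
sample = '<p class="posted">Contact: <a href="mailto:info@example.com">' \
         'info@example.com</a>, tel. +1 (555) 010-2030</p>'
p extract_contacts(sample)
```

Each page yields a hash of de-duplicated e-mail addresses and phone numbers, which could then be written to a CSV file or posted to your own site. For fields that regular expressions cannot pick out reliably, such as names or postal addresses, per-site XPath or CSS selectors like the one above are usually the more dependable route.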