
Automatically downloading files from a specific website

Source: https://www.devze.com, 2023-02-09 18:39 (from the web)

I am a very new programmer. A website provides a lot of zip files that I need, and new zip files are uploaded weekly. What I need is a program/script that automatically downloads them from the web each week. For example, this is the link: http://www.google.com/googlebooks/uspto-patents-applications-yellowbook.html (you can see a lot of zip files there).

So my questions are:

  1. What script do I have to write (I have no experience writing scripts, so what would you suggest?) so that I can download the zip files programmatically?

  2. Once the first question is solved, how should I make it download the new zip files that are uploaded each week?

Do I have to use the DOM... UNIX? If so, I will do some research on that to make it work.


Why wget? You can use HtmlAgilityPack to parse the website and extract all the links. Then you simply loop over the URLs and download each file, using C# all the way through. You can also launch a wget process from C# if you wish to do so.

On the other hand, this can easily be done in bash using sed/awk and grep in combination with wget.

Either way you will still need cron to schedule the job on a weekly basis.

// Minimal C# example: download a single file with WebClient.
using System.Net;

WebClient client = new WebClient();
client.DownloadFile("http://www.csharpfriends.com/Members/index.aspx", "index.aspx");
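The bash route mentioned above can be sketched roughly like this. The link pattern is an assumption: it matches simple href="...zip" attributes and would need adjusting for the real page's markup, and the file name page.html is just a placeholder.

```shell
#!/bin/sh
# Print every absolute .zip URL found in an HTML file.
# Assumes links appear as plain href="...zip" attributes.
extract_zip_links() {
  grep -o 'href="[^"]*\.zip"' "$1" | sed 's/^href="//; s/"$//'
}

# A typical weekly run might look like this (network calls shown
# as comments so the sketch itself stays side-effect free):
#   wget -q -O page.html http://www.google.com/googlebooks/uspto-patents-applications-yellowbook.html
#   extract_zip_links page.html | while read -r url; do
#     wget -nc "$url"   # -nc: skip files already downloaded last week
#   done
```

Passing -nc to wget is what makes the weekly re-run cheap: files fetched in a previous week are skipped, so only the newly uploaded zips are downloaded.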


I've also used JSoup (http://jsoup.org/) very effectively in Java/Scala applications to scrape data out of web pages.


If you are on Linux/UNIX, use 'wget' in a script for downloading the files, and 'cron' to schedule the downloading script.
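As a sketch of the scheduling half, a crontab entry like the following would run the download script once a week; the script path, log path, and the Monday-at-03:00 time are assumptions you would adapt.

```shell
# Hypothetical crontab entry (add via `crontab -e`):
# fields are minute hour day-of-month month day-of-week (1 = Monday)
0 3 * * 1 /home/user/download-zips.sh >> /home/user/download.log 2>&1
```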
