Automated download of website content using ASP.net

Source: web, via https://www.devze.com (开发者), 2022-12-24 18:27
Using ASP.net, what methods can I use to do the following:

  1. Open up a connection to a given URL to read HTML content
  2. Parse the given URL for hyperlinks, and place them in an array
  3. Loop through each hyperlink (only 1 level down), opening each one, saving its HTML content in a table, and moving on to the next hyperlink until done.

If ASP.net is not up to the task, other languages or free scripts/toolkits would be acceptable.

Thanks.


  • Use a System.Net.WebClient for step 1.
  • Use System.Text.RegularExpressions as shown here for step 2.
  • Create and use a System.Data.DataTable for step 2.
  • See here for step 3

I left out the obvious things, such as "loop through the DataTable", etc. You probably won't get a more in-depth answer on this site; the question is a bit too big to answer completely here.
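Since the question says other languages are acceptable, here is a rough Python sketch of the same three steps, using only the standard library. It is not the .NET approach listed above: `urllib.request` stands in for `System.Net.WebClient`, `html.parser` is swapped in for the regular-expression step, and a plain dictionary plays the role of the `DataTable`. The names `fetch_html`, `extract_links`, and `crawl_one_level` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def fetch_html(url, timeout=10):
    """Step 1: open a connection to the URL and return its HTML as text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Step 2: return the page's http(s) hyperlinks, de-duplicated, in order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    seen = set()
    unique = []
    for link in collector.links:
        if link.startswith(("http://", "https://")) and link not in seen:
            seen.add(link)
            unique.append(link)
    return unique


def crawl_one_level(start_url, fetch=fetch_html):
    """Step 3: fetch each linked page once (one level down) and store its HTML.

    Returns a dict mapping URL -> HTML text, or None if the fetch failed.
    The `fetch` parameter is an assumption added so the function can be
    exercised without network access.
    """
    pages = {}
    for link in extract_links(fetch(start_url), start_url):
        try:
            pages[link] = fetch(link)
        except OSError:
            # URLError, HTTPError and socket timeouts all derive from OSError.
            pages[link] = None
    return pages
```

Calling `crawl_one_level("https://example.com/")` would return one entry per hyperlink found on the start page. A real crawler would probably also want to respect robots.txt and pause between requests, which this sketch does not do.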
