How to get all the URLs of a website by crawling it with ASP.NET?

Developer (https://www.devze.com), 2023-03-12 19:10. Source: the web
How to get all the URLs of a website

Suppose I want to crawl a particular kind of data that is spread across different pages of a website. How can I get a list of all the URLs so that I can visit each of those similar pages?

For example, on a mobile phone website I want to get all the phones of one brand, which are listed at different URLs on the site. How can I get them? I notice that the div tag's class is the brand name for all the phones:

div class="Nokia" ... I want the URLs of the pages on the site that contain a div with the class Nokia.


You could use an HTML parser such as Html Agility Pack to extract all the URLs from anchors, forms, and so on. If a URL does not appear in the HTML you are parsing, you have no way (other than guessing) of knowing which subdomains and URLs exist for a given domain.
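
As a rough illustration of that approach, here is a minimal sketch of a breadth-first crawl that follows anchor links on one host and records the pages containing a div with a given class. It assumes the HtmlAgilityPack NuGet package and C# 8 or later. The names BrandCrawler, ExtractLinks, HasDivWithClass, FindPagesAsync and the maxPages limit are made up for this example and are not part of any library. It only follows `<a href>` links, so URLs reachable only through forms or JavaScript will be missed, and it does not check robots.txt or throttle requests.

```csharp
// Requires the HtmlAgilityPack NuGet package.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class BrandCrawler   // hypothetical helper class
{
    // Returns the absolute http(s) URLs of all <a href> links in the page.
    public static IEnumerable<Uri> ExtractLinks(HtmlDocument doc, Uri pageUri)
    {
        // SelectNodes returns null, not an empty collection, when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) yield break;

        foreach (var a in anchors)
        {
            var href = HtmlEntity.DeEntitize(a.GetAttributeValue("href", ""));
            // Resolve relative links against the page URL; skip mailto:, javascript:, etc.
            if (Uri.TryCreate(pageUri, href, out var absolute) &&
                (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                // Drop the #fragment so the same page is not queued twice.
                yield return new Uri(absolute.GetLeftPart(UriPartial.Query));
            }
        }
    }

    // True if the page has a <div> whose class attribute contains cssClass as a whole token.
    // The comparison is case-sensitive, and cssClass is assumed to be a plain name without quotes.
    public static bool HasDivWithClass(HtmlDocument doc, string cssClass)
    {
        var xpath = $"//div[contains(concat(' ', normalize-space(@class), ' '), ' {cssClass} ')]";
        return doc.DocumentNode.SelectSingleNode(xpath) != null;
    }

    // Crawls at most maxPages pages on the starting host and returns the matching page URLs.
    public static async Task<List<Uri>> FindPagesAsync(Uri start, string cssClass, int maxPages)
    {
        using var http = new HttpClient();
        var seen = new HashSet<string> { start.AbsoluteUri };
        var queue = new Queue<Uri>();
        queue.Enqueue(start);
        var matches = new List<Uri>();
        var fetched = 0;

        while (queue.Count > 0 && fetched < maxPages)
        {
            var page = queue.Dequeue();
            fetched++;

            string html;
            try { html = await http.GetStringAsync(page); }
            catch (HttpRequestException) { continue; }  // skip pages that fail to load

            var doc = new HtmlDocument();
            doc.LoadHtml(html);
            if (HasDivWithClass(doc, cssClass)) matches.Add(page);

            foreach (var link in ExtractLinks(doc, page))
            {
                // Stay on the starting host and queue each URL only once.
                if (link.Host == start.Host && seen.Add(link.AbsoluteUri))
                    queue.Enqueue(link);
            }
        }
        return matches;
    }
}
```

A call might look like `var pages = await BrandCrawler.FindPagesAsync(new Uri("https://example.com/"), "Nokia", 200);`, where the start URL and the page limit are placeholders. Restricting the crawl to the starting host reflects the point above: only URLs that actually appear in the parsed HTML can be discovered.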
