What method should I employ to extract keywords from a URL?

Developer https://www.devze.com 2023-02-11 18:34 Source: Internet
I am working on extraction of keywords. The system takes a URL as input and the output is supposed to be keywords describing the contents of the URL. We are considering only textual parts now. I would like to know what methods I can employ for extracting keywords from URLs and how they compare with each other. Suggestions and redirections are welcome.


I think you can use this method:

Read the page with urllib2 ( http://docs.python.org/library/urllib2.html?highlight=urllib2#module-urllib2 ; in Python 3 the equivalent module is urllib.request ), then remove the HTML tags to get the plain text of the page.

Then count how often each word is used and keep the top ten (or the top N by count) as the keywords.
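
A minimal sketch of that approach in Python 3, using only the standard library. The names `TextExtractor`, `html_to_text`, `top_keywords`, `keywords_from_url` and the tiny `STOP_WORDS` set are my own assumptions for illustration, not part of any library. The word pattern only matches ASCII letters, so it is a rough fit for English pages only.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen

# Tiny illustrative stop-word list (an assumption; a real list would be much longer).
STOP_WORDS = {"the", "and", "for", "with", "this", "that", "are", "was", "from", "you"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags and return the visible text as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def top_keywords(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    return counts.most_common(n)


def keywords_from_url(url, n=10):
    """Download the page, strip the markup, and count words."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return top_keywords(html_to_text(html), n)
```

Calling `keywords_from_url("https://example.com")` would return a list of `(word, count)` pairs. Raw frequency tends to favor generic words, so a larger stop-word list or a weighting scheme such as TF-IDF is a common next step if the results look too noisy.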
