Python: save a list of URLs to a .txt file

开发者 https://www.devze.com 2023-03-09 22:55 Source: web
Hello, I am trying to write a Python function that saves a list of URLs to a .txt file.

Example: visit http://forum.domain.com/ and save every URL containing viewtopic.php?t= to a .txt file:

http://forum.domain.com/viewtopic.php?t=1333
http://forum.domain.com/viewtopic.php?t=2333

I use this code, but it does not save anything. I am very new to Python; can someone help me create this?

web_obj = opener.open('http://forum.domain.com/')
data = web_obj.read()

fl_url_list = open('urllist.txt', 'r')
url_arr = fl_url_list.readlines()
fl_url_list.close()


This is far from trivial and has quite a few corner cases (I assume the page you're referring to is a web page).

To give you a few pointers, you need to:

  • download the web page: you're already doing this (the result is in data)
  • extract the URLs: this is the hard part. Most probably you'll want to use an HTML parser, extract the <a> tags, fetch the href attribute of each and put those into a list. Then filter that list to keep only the URLs formatted the way you want (say, containing viewtopic). Let's say you end up with urlList
  • then open a file for writing text (so mode wt, not r, which only reads)
  • write the content: f.write('\n'.join(urlList))
  • close the file

I advise you to try following these steps and to ask specific questions when you get stuck on a particular issue.
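
To make the steps concrete, here is a minimal sketch that uses only the standard library. The names LinkCollector, extract_topic_urls, save_urls and sample_html are made up for illustration and are not from the question. The sample markup stands in for the downloaded page (data in the question, decoded to a string), and a real forum page may well need extra handling, for example session IDs in the links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_topic_urls(html, base_url, marker="viewtopic.php?t="):
    """Return absolute URLs containing `marker`, in page order, without duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if marker in absolute and absolute not in urls:
            urls.append(absolute)
    return urls


def save_urls(urls, path):
    """Write one URL per line; 'wt' opens the file for writing text."""
    with open(path, "wt", encoding="utf-8") as f:
        f.write("\n".join(urls))


# Hypothetical stand-in for the downloaded page; the markup is invented.
sample_html = """
<a href="viewtopic.php?t=1333">Topic A</a>
<a href="/viewtopic.php?t=2333">Topic B</a>
<a href="index.php">Home</a>
"""

url_list = extract_topic_urls(sample_html, "http://forum.domain.com/")
save_urls(url_list, "urllist.txt")
```

The with block closes the file automatically, so the separate "close the file" step is covered. With a real page you would pass the decoded download instead of sample_html, assuming the page is UTF-8: data.decode('utf-8', errors='replace').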
