Below is a script I found on a forum, and it is almost exactly what I need, except that I need to read about 30 different URLs and print the results all together. I have tried a few options, but the script just breaks. How can I fetch all 30 URLs, parse them, and then print them out?
If you can help me I would be very grateful, thank you.
import sys
import string
from urllib2 import urlopen
import xml.dom.minidom

var_xml = urlopen("http://www.test.com/bla/bla.xml")
var_all = xml.dom.minidom.parse(var_xml)

def extract_content(var_all, var_tag, var_loop_count):
    return var_all.firstChild.getElementsByTagName(var_tag)[var_loop_count].firstChild.data

var_loop_count = 0
var_item = " "
while len(var_item) > 0:
    var_title = extract_content(var_all, "title", var_loop_count)
    var_date = extract_content(var_all, "pubDate", var_loop_count)
    print "Title: ", var_title
    print "Published Date: ", var_date
    print " "
    var_loop_count += 1
    try:
        var_item = var_all.firstChild.getElementsByTagName("item")[var_loop_count].firstChild.data
    except:
        var_item = ""
If this is standard RSS, I'd encourage you to use http://www.feedparser.org/ ; extracting all the items there is straightforward.
You are overwriting var_item, var_title, and var_date on each pass through the loop. Make a list, append each (var_title, var_date) pair to it as you go, and at the end just print out the list.
http://docs.python.org/tutorial/datastructures.html
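A minimal sketch of that approach, staying with xml.dom.minidom from the original script (written for Python 3, where urlopen lives in urllib.request; on Python 2 use urllib2.urlopen as before; the URL is the placeholder from the question):

```python
import xml.dom.minidom
from urllib.request import urlopen  # urllib2.urlopen on Python 2

def extract_items(dom):
    """Return a list of (title, pubDate) pairs from one parsed RSS document."""
    items = []
    for item in dom.getElementsByTagName("item"):
        title = item.getElementsByTagName("title")[0].firstChild.data
        date = item.getElementsByTagName("pubDate")[0].firstChild.data
        items.append((title, date))
    return items

def fetch_all(urls):
    """Fetch and parse each feed URL, collecting every item into one list."""
    all_items = []
    for url in urls:
        with urlopen(url) as resp:
            dom = xml.dom.minidom.parse(resp)
        all_items.extend(extract_items(dom))
    return all_items

# Usage with placeholder URLs -- collect everything, then print together:
# for title, date in fetch_all(["http://www.test.com/bla/bla.xml"]):
#     print("Title:", title)
#     print("Published Date:", date)
#     print()
```

Iterating over getElementsByTagName("item") directly also avoids the bare try/except and the manual loop counter from the original script.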