I'm trying to make it so this script
from BeautifulSoup import BeautifulSoup
import sys, re, urllib2
import codecs
html_str = urllib2.urlopen(URL).read()
soup = BeautifulSoup(html_str)
for row in soup.findAll("tr"):
    for col in row.findAll(re.compile("td|th")):
        sys.stdout.write((col.string if col.string else '') + '|')
    print # Newline
sends its output to a text file instead.
The easiest way, if you're on *nix, is to redirect the output from the shell:
python file.py > filename.txt
In code, though:
from BeautifulSoup import BeautifulSoup
import re, urllib2
import codecs
html_str = urllib2.urlopen(URL).read()
soup = BeautifulSoup(html_str)
# codecs.open encodes the unicode cell text as UTF-8 when writing
out = codecs.open('file.txt', 'w', 'utf-8')
for row in soup.findAll("tr"):
    for col in row.findAll(re.compile("td|th")):
        out.write((col.string if col.string else '') + '|')
    out.write('\n')  # Newline after each row, like the print in the question
out.close()
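If you want to keep both options open, one possible approach is to pass the destination in as a parameter, so the same loop can write to the console or to a file. The sketch below is only an illustration and makes a few assumptions: it targets Python 3 rather than the Python 2 code above, the write_rows helper and the sample rows are hypothetical stand-ins for the scraped cells, and UTF-8 is an assumed encoding.

```python
import io
import sys


def write_rows(rows, out):
    """Write each row as pipe-terminated cells, one row per line."""
    for row in rows:
        for cell in row:
            # Treat a missing cell (None) as empty, as the original script does.
            out.write((cell if cell else '') + '|')
        out.write('\n')


# Hypothetical sample data standing in for the scraped table cells.
rows = [['Name', 'Score'], ['Ann', None]]

# Same function, two destinations: the console, then a UTF-8 text file.
write_rows(rows, sys.stdout)
with io.open('file.txt', 'w', encoding='utf-8') as handle:
    write_rows(rows, handle)
```

The with block closes the file even if a write fails, so there is no need for an explicit close() call.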