OK. I have a massive HTML file, and I only want the text that occurs between the tags
<center><span style="font-size: 144%;"></span></center>
and
<dl> <dd><i></i></dd> </dl>
I am using Python 2.6 and BeautifulSoup, but I have no idea where to begin. I'm assuming it's not difficult?
Try something like:
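import BeautifulSoup  # BeautifulSoup 3, the Python 2 release

soup = BeautifulSoup.BeautifulSoup(YOUR_HTML)
texts = soup.findAll(text=True)  # every text node in the document, as a list
print texts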
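That gives you every text node in the file, not just the part between your two markers. To narrow it down, one option is to start collecting once the closing </center> is seen and stop at the next <dl>. Here is a rough sketch of that idea. Note the assumptions: it uses Python 3 and the standard library's html.parser instead of BeautifulSoup, it treats "between the tags" as "after the </center> block and before the <dl> block", and the names BetweenTagsParser, text_between and sample are made up for illustration.

```python
from html.parser import HTMLParser


class BetweenTagsParser(HTMLParser):
    """Collect text that appears after </center> and before the next <dl>.

    Hypothetical helper: the start/end markers are assumptions based on
    the two snippets in the question.
    """

    def __init__(self):
        super().__init__()
        self.capturing = False  # True while we are between the two markers
        self.chunks = []        # non-empty text pieces found in that region

    def handle_endtag(self, tag):
        # The region of interest starts once the <center> block is closed.
        if tag == "center":
            self.capturing = True

    def handle_starttag(self, tag, attrs):
        # The region of interest ends when the <dl> block opens.
        if tag == "dl":
            self.capturing = False

    def handle_data(self, data):
        text = data.strip()
        if self.capturing and text:
            self.chunks.append(text)


def text_between(html):
    """Return the list of text pieces found between the two markers."""
    parser = BetweenTagsParser()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Made-up sample input shaped like the markup in the question.
sample = (
    '<p>skip me</p>'
    '<center><span style="font-size: 144%;">Title</span></center>'
    '<p>First paragraph.</p><p>Second paragraph.</p>'
    '<dl> <dd><i>Footer</i></dd> </dl>'
    '<p>skip me too</p>'
)
print(text_between(sample))
```

If the file has several <center> or <dl> blocks, every stretch after a </center> and before the following <dl> is collected, so you may need stricter checks (for example on the span's style attribute) for your real data.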