I have this HTML:
<a href="/watch?gl=US&client=mv-google&hl=en&v=0C_yXOhJxWg">Miss Black OCU 2011</a>
My program reads an HTML file, and the snippet above is a chunk of that file. How do I extract "Miss Black OCU 2011" using BeautifulSoup in Python?
Here's a quick fix: find every <a> tag that has an href attribute and render its contents:
>>> from BeautifulSoup import BeautifulSoup as BS
>>> soup = BS('<a href="/watch?gl=US&client=mv-google&hl=en&v=0C_yXOhJxWg">Miss Black OCU 2011</a>')
>>> tags = soup.findAll('a', href=True)
>>> for tag in tags: tag.renderContents()
'Miss Black OCU 2011'
>>>
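The session above uses the old BeautifulSoup 3 import. For anyone on Python 3, here is a rough equivalent, assuming the newer bs4 package is installed (the `html` and `titles` names and the choice of the built-in "html.parser" backend are mine, not part of the original answer):

```python
from bs4 import BeautifulSoup

# Same snippet as in the question.
html = '<a href="/watch?gl=US&client=mv-google&hl=en&v=0C_yXOhJxWg">Miss Black OCU 2011</a>'

# "html.parser" is the parser bundled with Python; other backends should also work.
soup = BeautifulSoup(html, "html.parser")

# Collect the text of every <a> tag that carries an href attribute.
titles = [a.get_text() for a in soup.find_all("a", href=True)]
print(titles)  # ['Miss Black OCU 2011']
```

Here find_all and get_text play the roles of findAll and renderContents; get_text returns only the text, with any nested markup stripped.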