I'm using Beautiful Soup.
Is there any way that I can get hold of a tag based on its position next to a comment (something not included in the parse tree)?
For example, let's say I have...
<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--text-->
<p>paragraph 3</p>
</body>
</html>
In this example, how might I identify <p>paragraph 2</p>, given that I'm searching for the comment "<!--text-->"?
Thanks for any help.
Comments appear in the Beautiful Soup parse tree like any other node. For example, to find the comment with the text "some comment text" and then print the previous <p> element, you could do:
from BeautifulSoup import BeautifulSoup, Comment
soup = BeautifulSoup('''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>''')
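# Match only Comment nodes whose text is exactly the one we want.
def right_comment(e):
    return isinstance(e, Comment) and e == 'some comment text'
# Find the comment, then step back to the nearest preceding <p> sibling.
e = soup.find(text=right_comment)
print e.findPreviousSibling('p')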
... that will print out:
<p>paragraph 2</p>
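The code above targets the old BeautifulSoup 3 package on Python 2. If you are on Beautiful Soup 4 with Python 3, the same idea might look roughly like the sketch below. It assumes the bs4 package is installed and uses the built-in "html.parser"; the variable names (html, comment, previous_p) are only illustrative.

```python
from bs4 import BeautifulSoup, Comment

html = """<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# Comment is a string node in the tree, so a string filter can match it.
comment = soup.find(
    string=lambda s: isinstance(s, Comment) and s == "some comment text"
)

# Walk backwards over the comment's siblings to the nearest <p> tag.
previous_p = comment.find_previous_sibling("p")
print(previous_p)
```

If the comment might be missing from the page, soup.find returns None, so it is worth checking for that before calling find_previous_sibling.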