Beautiful Soup - identify tag based on position next to comment

Developer https://www.devze.com 2023-02-16 02:00 Source: Internet

I'm using Beautiful Soup.

Is there any way that I can get hold of a tag based on its position next to a comment (something not included in the parse tree)?

For example, let's say I have...

<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--text-->
<p>paragraph 3</p>
</body>
</html>

In this example, how might I identify <p>paragraph 2</p>, given that I'm searching for the comment <!--text-->?

Thanks for any help.


Comments do appear in the BeautifulSoup parse tree, like any other node. For example, to find the comment with the text "some comment text" and then print the <p> element that precedes it, you could do:

from BeautifulSoup import BeautifulSoup, Comment

soup = BeautifulSoup('''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>''')

def right_comment(e):
    return isinstance(e, Comment) and e == 'some comment text'

e = soup.find(text=right_comment)

print e.findPreviousSibling('p')

... that will print out:

<p>paragraph 2</p>
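
The code above is written for BeautifulSoup 3 on Python 2. For anyone on Python 3, here is a rough sketch of the same idea, assuming the bs4 package (Beautiful Soup 4) is installed and using the built-in 'html.parser' backend. The names HTML, comment and previous_p are only illustrative, not part of the original answer.

```python
from bs4 import BeautifulSoup, Comment

HTML = '''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>'''

# Assumption: the standard-library parser is good enough for this small document.
soup = BeautifulSoup(HTML, 'html.parser')

def right_comment(node):
    # Match only Comment nodes whose text equals the target string.
    return isinstance(node, Comment) and node == 'some comment text'

# Comments are string nodes in the tree, so a string filter can find them.
comment = soup.find(string=right_comment)

# Walk backwards through the comment's siblings to the nearest <p> tag.
previous_p = comment.find_previous_sibling('p')
print(previous_p)
```

If you want the tag after the comment instead, find_next_sibling('p') works the same way in the other direction.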