etree.findall: 'OR'-lookup?

Developer https://www.devze.com 2022-12-22 21:34 Source: Internet
I want to find all stylesheet definitions in an XHTML file with lxml.etree.findall. This could be as simple as:

elems = tree.findall('link[@rel="stylesheet"]') + tree.findall('style')

But the problem with CSS style definitions is that the order matters, e.g.

<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />

If the contents of the style tag are applied after the rules in the two link tags, the result may be completely different from the one where the rules are applied in order of definition.
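To make the ordering problem concrete, here is a minimal sketch. It assumes a namespace-free XHTML document; the `<html>`/`<head>` wrapper and the names `XHTML` and `concatenated` are made up for illustration. The stdlib fallback is only there because `findall` behaves the same way in both libraries.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the standard library's findall() is close enough for this demo.
    import xml.etree.ElementTree as etree

# Hypothetical, namespace-free XHTML wrapped around the three elements above.
XHTML = """<html><head>
<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
</head><body/></html>"""

root = etree.fromstring(XHTML)

# Two separate lookups glued together: every link comes first, then every style.
elems = root.findall('.//link[@rel="stylesheet"]') + root.findall('.//style')
concatenated = [el.tag for el in elems]

print(concatenated)  # ['link', 'link', 'style'], not the order in the document
```

The style element ends up last, even though it sits between the two link elements in the source.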

So, how would I do a lookup that includes both link[@rel="stylesheet"] and style?


This is possible using XPath:

from lxml import etree

data = """<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
"""

# Parse the fragment with lxml's HTML parser
h = etree.HTML(data)

# The union operator (|) combines both lookups; the matches come back in document order
h.xpath('//link[@rel="stylesheet"]|//style')

[<Element link at 97a007c>,
 <Element style at 97a002c>,
 <Element link at 97a0054>]
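If XPath is not an option (the standard library's ElementTree, for instance, has no `xpath()` method), roughly the same result can be had by walking the tree once and filtering. This is only a sketch under assumptions: the helper name `stylesheet_elements`, the sample document and the `summary` variable are hypothetical, and the tags are assumed to carry no XHTML namespace prefix.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: iter() and get() behave the same in the standard library.
    import xml.etree.ElementTree as etree

# Hypothetical, namespace-free XHTML document for illustration.
XHTML = """<html><head>
<link rel="stylesheet" type="text/css" href="/media/css/first.css" />
<style>body {font-size: 10px;}</style>
<link rel="stylesheet" type="text/css" href="/media/css/second.css" />
</head><body/></html>"""


def stylesheet_elements(root):
    """Collect <link rel="stylesheet"> and <style> elements in document order."""
    found = []
    for el in root.iter():  # iter() visits elements in document order
        if el.tag == "style":
            found.append(el)
        elif el.tag == "link" and el.get("rel") == "stylesheet":
            found.append(el)
    return found


root = etree.fromstring(XHTML)
summary = [(el.tag, el.get("href")) for el in stylesheet_elements(root)]
print(summary)
# [('link', '/media/css/first.css'), ('style', None), ('link', '/media/css/second.css')]
```

For a real XHTML file with `xmlns="http://www.w3.org/1999/xhtml"`, the tag comparison would have to use the namespaced form, e.g. `{http://www.w3.org/1999/xhtml}link`.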