From HTML to CSS style with Python

Developer (https://www.devze.com), 2023-03-19 09:41, source: web
I've extracted certain elements from an HTML page with BeautifulSoup and want to extract the corresponding CSS styles (which in most cases reside in external CSS files) via Python.

How can I get a jQuery-style selector if I have an HTML element? If I had this selector, I could use cssutils to parse the CSS and look up the matching styles.

TIA for help.


You may take a look at PyQuery's API. It provides a CSS selector syntax similar to jQuery's, and it's much faster than BeautifulSoup because it relies on lxml to do the parsing work.

from pyquery import PyQuery as pq  # pq(...) parses markup into a PyQuery object

html = '<div class="foo"><a href="somewhere"></a></div>'
parsed = pq(html)  # PyQuery object; calling it runs a CSS selection

pq_list = parsed('.foo a')  # select with jQuery-style CSS syntax
for node in pq_list:  # each node is an lxml element object
    print(node.attrib['href'])  # => somewhere
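
As for going from an element back to a selector, here is a minimal sketch of one way to do it. It assumes a well-formed snippet and uses the standard library's xml.etree.ElementTree in place of BeautifulSoup or lxml, so that it runs without extra packages. The helper names simple_selector and selector_path are made up for this example and are not part of any library.

```python
import xml.etree.ElementTree as ET


def simple_selector(el):
    """Return the tag plus #id and .class parts, e.g. 'div#main.foo.bar'."""
    sel = el.tag
    if el.get("id"):
        sel += "#" + el.get("id")
    for cls in (el.get("class") or "").split():
        sel += "." + cls
    return sel


def selector_path(root, target):
    """Build a child-combinator selector from root down to target."""
    # ElementTree nodes have no parent pointer, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    parts = []
    node = target
    while node is not None:
        parts.append(simple_selector(node))
        node = parents.get(node)  # None once we walk past the root
    return " > ".join(reversed(parts))


html = '<div class="foo"><a href="somewhere"></a></div>'
root = ET.fromstring(html)
link = root.find("a")
print(selector_path(root, link))  # div.foo > a
```

Keep in mind that a stylesheet rarely targets an element by this exact path, so the generated selector will not match the rule text literally. It is often easier to go the other way: take each rule's selector from the parsed stylesheet, run it against the document, and keep the rules whose matches include your element.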