I need to get content for this XPath:
/html/body/div/table[2]/tbody/tr/td[2]
It's copied from Firebug. How can I do this? I have a very large HTML document, so I don't want to (and don't know how to :) ) grep it. Thanks.
lxml can handle HTML (and provides pretty good XPath support):
>>> import lxml.html
>>> tree = lxml.html.parse('test.html')
>>> for node in tree.xpath('/html/body/div/table[2]/tbody/tr/td[2]'):
...     print(node.text)
...
first row, second column
second row, second column
Just make sure that you use its HTML parser (lxml.html), not the XML one.
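One thing to watch for with paths copied from Firebug: browsers add a tbody element to tables in the live DOM even when the source file has none, and as far as I know lxml's HTML parser does not add it. So if the query comes back empty, trying the path without /tbody may help. Here is a rough sketch of that check; the SAMPLE markup is made up for illustration and is not the asker's document.

```python
import lxml.html

# Hypothetical sample markup: note that the source contains no <tbody> tags.
SAMPLE = """
<html><body><div>
  <table><tr><td>ignored</td></tr></table>
  <table>
    <tr><td>r1c1</td><td>first row, second column</td></tr>
    <tr><td>r2c1</td><td>second row, second column</td></tr>
  </table>
</div></body></html>
"""

root = lxml.html.fromstring(SAMPLE)

# Path as copied from the browser, and the same path with /tbody dropped.
firebug_path = '/html/body/div/table[2]/tbody/tr/td[2]'
source_path = '/html/body/div/table[2]/tr/td[2]'

# text_content() also collects text from nested tags, unlike .text.
with_tbody = [td.text_content() for td in root.xpath(firebug_path)]
without_tbody = [td.text_content() for td in root.xpath(source_path)]

print(with_tbody)
print(without_tbody)
```

If your file really does contain tbody tags, the original path should work unchanged.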
import lxml.html as h
tree = h.parse("keys_results.html")
# string(...) returns the text of the first element whose text contains 'needed_text'
text = tree.xpath("string(//*[contains(text(),'needed_text')])")
print(text)
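To see what that expression returns, here is a small self-contained sketch on an inline string instead of a file. The SAMPLE markup is invented for illustration. Two things it tries to show: string() gives back the text of the first matching element only, and you get an empty string when nothing matches.

```python
import lxml.html

# Hypothetical sample markup, not the real keys_results.html.
SAMPLE = "<html><body><p>other</p><p>has needed_text here</p></body></html>"
root = lxml.html.fromstring(SAMPLE)

# contains(text(), ...) tests the first text node of each element, so only
# the second <p> matches here; string() then returns that element's text.
found = root.xpath("string(//*[contains(text(),'needed_text')])")

# No element contains this text, so string() yields an empty string.
missing = root.xpath("string(//*[contains(text(),'no_such_text')])")

print(repr(found))
print(repr(missing))
```

Writing contains(., ...) instead of contains(text(), ...) would also match html and body, because their string value includes all descendant text, so string() would then return the text of the whole document.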