开发者

XPath match every node containing text

开发者 https://www.devze.com 2023-02-23 08:53 出处:网络
How do I match all child nodes containing text recursively. If I have a tree like table tr td \"hello\" td

How do I match all child nodes containing text recursively.

If I have a tree like

table
 tr
  td
   "hello"
  td
   b
    "hi"
 tr
  td
   "salud"
  td
   em
    "bonjour"
开发者_运维技巧

How do I match every single string within the table node with xpath? Something like "//table/*/text()"?


The XPath expression you gave was almost correct already:

//table//text()

will get you all text nodes within all tables in the document.


How about the following?

from lxml import etree
from StringIO import StringIO

input = '''
<table>
 <tr>
  <td>hello</td>
  <td><b>hi</b></td>
 </tr>
 <tr>
  <td>salud</td>
  <td><em>bonjour</em></td>
 </tr>
</table>
'''

parser = etree.HTMLParser()
tree = etree.parse(StringIO(input), parser)

for p in tree.xpath("//table/tr/td//text()"):
    print p

... which gives the output:

hello
hi
salud
bonjour
0

精彩评论

暂无评论...
验证码 换一张
取 消