Parsing XML Using minidom

Developer https://www.devze.com 2023-03-19 12:00 Source: Internet

I have an XML file from which I want to extract data from certain tags, but ONLY where those tags are nested within other tags. Tags with the same names also occur elsewhere in the XML document, and I don't want the data from those.

Sample XML:

<root>
    <tag1>content I don't want</tag1>
    <tag2>content I don't want</tag2>
    <tag3>content I don't want</tag3>
    <item>
        <tag1>content I want</tag1>
        <tag2>content I want</tag2>
        <tag3>content I want</tag3>
    </item>
    <item>
        <tag1>content I want</tag1>
        <tag2>content I want</tag2>
        <tag3>content I want</tag3>
    </item>
</root>

Python code (which retrieves all data, including from the tags I don't want):

for counter in range(2):
    # Every lookup below searches the whole document, not just one <item>.
    variable0 = XML_Document.getElementsByTagName('item')[counter]
    variable1 = XML_Document.getElementsByTagName('tag1')[counter].toxml(encoding="utf-8")
    variable2 = XML_Document.getElementsByTagName('tag2')[counter].toxml(encoding="utf-8")
    variable3 = XML_Document.getElementsByTagName('tag3')[counter].toxml(encoding="utf-8")
    print(counter)
    print(variable1)
    print(variable2)
    print(variable3)

How do I modify the loop to access only the data in the tags nested inside the item tags?


You can call getElementsByTagName() on any element node, not just on the document. It then searches only that element's descendants:

for item in XML_Document.getElementsByTagName('item'):
    # Search inside this <item> only; [0] takes the first match.
    tag1 = item.getElementsByTagName('tag1')[0].toxml(encoding="utf-8")
    tag2 = item.getElementsByTagName('tag2')[0].toxml(encoding="utf-8")
    tag3 = item.getElementsByTagName('tag3')[0].toxml(encoding="utf-8")
    print(tag1, tag2, tag3)
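
For anyone who wants to try this end to end, here is a minimal self-contained sketch in Python 3 that runs the same per-item lookup against the sample XML from the question. The names SAMPLE_XML, extract_items and rows are made up for illustration and are not part of the original code. The sketch also reads each tag's text through firstChild.data rather than toxml(), which is an assumption about what the asker ultimately wants (the text without the surrounding markup).

```python
from xml.dom import minidom

# Sample document from the question, with the same unwanted top-level tags.
SAMPLE_XML = """<root>
    <tag1>content I don't want</tag1>
    <tag2>content I don't want</tag2>
    <tag3>content I don't want</tag3>
    <item>
        <tag1>content I want</tag1>
        <tag2>content I want</tag2>
        <tag3>content I want</tag3>
    </item>
    <item>
        <tag1>content I want</tag1>
        <tag2>content I want</tag2>
        <tag3>content I want</tag3>
    </item>
</root>"""


def extract_items(xml_text, tag_names=("tag1", "tag2", "tag3")):
    """Return one dict per <item>, mapping tag name to its text (hypothetical helper)."""
    document = minidom.parseString(xml_text)
    rows = []
    for item in document.getElementsByTagName("item"):
        row = {}
        for name in tag_names:
            # Searching from `item` limits the match to that element's descendants.
            matches = item.getElementsByTagName(name)
            node = matches[0] if matches else None
            # Guard against a missing or empty tag instead of raising.
            if node is not None and node.firstChild is not None:
                row[name] = node.firstChild.data
            else:
                row[name] = None
        rows.append(row)
    return rows


rows = extract_items(SAMPLE_XML)
for index, row in enumerate(rows):
    print(index, row)
```

Note that getElementsByTagName() matches descendants at any depth, not only direct children, so if an item could contain nested elements that reuse the same tag names, iterating over item.childNodes and checking nodeName would be the stricter option.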