I am using the builtin Python ElementTree module. It is straightforward to access children, but what about parent or sibling nodes? Can this be done efficiently without traversing the entire tree?
There's no direct support in the form of a parent attribute, but you can perhaps use the patterns described here to achieve the desired effect. The following one-liner is suggested (updated from the linked-to post to Python 3.8) to create a child-to-parent mapping for a whole tree, using the method xml.etree.ElementTree.Element.iter:
parent_map = {c: p for p in tree.iter() for c in p}
Vinay's answer should still work, but for Python 2.7+ and 3.2+ the following is recommended:
parent_map = {c:p for p in tree.iter() for c in p}
getiterator() is deprecated in favor of iter(), and it's nice to use the new dict comprehension syntax.
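The question also asks about siblings. Here is a minimal sketch of how the same mapping could answer that, assuming a small sample document and a helper named get_siblings (both are made up for illustration and are not part of ElementTree):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
root = ET.fromstring("<a><b><c/><d/><e/></b></a>")

# Child-to-parent mapping, built with a single pass over the tree.
parent_map = {c: p for p in root.iter() for c in p}

def get_siblings(elem):
    """Return the other children of elem's parent (empty list for the root)."""
    parent = parent_map.get(elem)
    if parent is None:
        return []
    return [s for s in parent if s is not elem]

d = root.find(".//d")
print(parent_map[d].tag)                 # b
print([s.tag for s in get_siblings(d)])  # ['c', 'e']
```

Building the map still walks the whole tree once, but each later parent or sibling lookup is a dictionary access.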
Secondly, while constructing an XML document, it is possible that a child will have multiple parents, although this gets removed once you serialize the document. If that matters, you might try this:
parent_map = {}
for p in tree.iter():
    for c in p:
        if c in parent_map:
            parent_map[c].append(p)
            # Or raise, if you don't want to allow this.
        else:
            parent_map[c] = [p]
            # Or parent_map[c] = p if you don't want to allow this
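To make the multiple-parent case concrete, here is a small sketch that attaches the same Element object to two parents while building a tree. The tag names are invented for the example, and setdefault is used as a compact equivalent of the if/else form:

```python
import xml.etree.ElementTree as ET

# Hypothetical tree in which one Element object is attached to two parents.
root = ET.Element("root")
left = ET.SubElement(root, "left")
right = ET.SubElement(root, "right")
shared = ET.Element("shared")
left.append(shared)
right.append(shared)  # same object, second parent

# Map each child to the list of all parents that contain it.
parent_map = {}
for p in root.iter():
    for c in p:
        parent_map.setdefault(c, []).append(p)

print([p.tag for p in parent_map[shared]])  # ['left', 'right']

# After a serialize/parse round trip the sharing is gone:
# each parent gets its own separate copy of the element.
reparsed = ET.fromstring(ET.tostring(root))
print(len(reparsed.findall(".//shared")))  # 2
```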
You can use the XPath ... notation in ElementTree.
<parent>
<child id="123">data1</child>
</parent>
xml.findall('.//child[@id="123"]...')
>> [<Element 'parent'>]
As mentioned in Get parent element after using find method (xml.etree.ElementTree) you would have to do an indirect search for parent. Having xml:
<a>
<b>
<c>data</c>
<d>data</d>
</b>
</a>
Assuming you have parsed this into an Element stored in the variable xml, you can use:
In[1] parent = xml.find('.//c/..')
In[2] child = parent.find('./c')
Resulting in:
Out[1]: <Element 'b' at 0x00XXXXXX>
Out[2]: <Element 'c' at 0x00XXXXXX>
A higher-level parent would be found as:
secondparent = xml.find('.//c/../..')
which is <Element 'a' at 0x00XXXXXX>.
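For reference, the same lookups as a self-contained script, assuming the XML above is parsed from a string with ET.fromstring:

```python
import xml.etree.ElementTree as ET

# The document from the example above, parsed from a string.
xml = ET.fromstring("<a><b><c>data</c><d>data</d></b></a>")

parent = xml.find(".//c/..")          # parent of <c>
child = parent.find("./c")            # back down to <c>
secondparent = xml.find(".//c/../..")  # parent of the parent

print(parent.tag, child.tag, secondparent.tag)  # b c a
```

Each search starts at xml, the root element, so the path never has to climb above the element that find() is called on.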
Pasting here my answer from https://stackoverflow.com/a/54943960/492336:
I had a similar problem and I got a bit creative. Turns out nothing prevents us from adding the parent info ourselves. We can later strip it once we no longer need it.
def addParentInfo(et):
    # Store a reference to the parent in each child's attributes.
    for child in et:
        child.attrib['__my_parent__'] = et
        addParentInfo(child)

def stripParentInfo(et):
    # Remove the temporary parent references again.
    for child in et:
        child.attrib.pop('__my_parent__', None)
        stripParentInfo(child)

def getParent(et):
    # Returns None for the root, which has no parent entry.
    return et.attrib.get('__my_parent__')

# Example usage
tree = ...
addParentInfo(tree.getroot())
el = tree.findall(...)[0]
parent = getParent(el)
while parent is not None:
    doSomethingWith(parent)
    parent = getParent(parent)
stripParentInfo(tree.getroot())
The XPath '..' selector cannot be used to retrieve the parent node when the search starts at the child itself, on either Python 3.5.3 or 3.6.1 (at least on OS X), e.g. in interactive mode:
import xml.etree.ElementTree as ET
root = ET.fromstring('<parent><child></child></parent>')
child = root.find('child')
parent = child.find('..') # retrieve the parent
parent is None # unexpected answer True
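This result matches the documented behaviour of '..': the path cannot reach above the element that find() is called on, so it returns None there. A short sketch of the contrast, reusing the same document and starting the search from the root instead:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<parent><child></child></parent>')
child = root.find('child')

# Starting at <child>, '..' would have to leave the searched subtree.
print(child.find('..'))            # None

# Starting at the root, the parent lies inside the searched subtree.
found = root.find('.//child/..')
print(found.tag)                   # parent
```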
The last answer breaks all hopes...
Got an answer from https://towardsdatascience.com/processing-xml-in-python-elementtree-c8992941efd2:
Tip: use '...' inside of XPath to return the parent element of the current element.
for object_book in root.findall('.//*[@name="The Hunger Games"]...'):
    print(object_book)
If you are using lxml, I was able to get the parent element with the following:
parent_node = next(child_node.iterancestors())
This will raise a StopIteration exception if the element doesn't have ancestors, so be prepared to catch that if you may run into that scenario. lxml elements also have a getparent() method, which returns None for the root instead of raising.
import xml.etree.ElementTree as ET
f1 = "yourFile"
xmlTree = ET.parse(f1)
# Iterating over the root element yields its direct children.
for child in xmlTree.getroot():
    print(child.tag)
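Another way, if you just want a single subElement's parent and already know the subElement's XPath, is to append "/.." to that path and search from the element the path is relative to (here called root):
parentElement = root.find(xpath + "/..")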
Look at the 19.7.2.2. section: Supported XPath syntax ...
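Find a node's parent using the path (in the standard-library ElementTree this returns None when called on the node itself, because a path cannot go above the element that find() is called on, so start the search from an ancestor instead):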
parent_node = node.find('..')