lxml objectify does not call constructors for custom element classes

Developer https://www.devze.com 2023-01-31 20:06 Source: Internet

lxml.objectify does not seem to call the constructors for my custom element classes:

from lxml import objectify, etree

class CustomLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        lookupmap = { 'custom' : CustomElement }
        try:
            return lookupmap[name]
        except KeyError:
            return None

class CustomElement(etree.ElementBase):
    def __init__(self):
        print("Made CustomElement")

parser = objectify.makeparser()
parser.set_element_class_lookup(CustomLookup())
root = objectify.parse(fname, parser).getroot()  # fname: path to the XML file shown below

Suppose the file being parsed is

<custom />

I would like this to print "Made CustomElement", but it does not. Can I make it call the constructor?

How is it possible for an instance of the CustomElement class to be created without the constructor being called?

>>> isinstance(root,CustomElement)
True


From the lxml docs:

Element initialization

There is one thing to know up front. Element classes must not have an __init__ or __new__ method. There should not be any internal state either, except for the data stored in the underlying XML tree. Element instances are created and garbage collected at need, so there is no way to predict when and how often a proxy is created for them. Even worse, when the __init__ method is called, the object is not even initialized yet to represent the XML tag, so there is not much use in providing an __init__ method in subclasses.

Most use cases will not require any class initialisation, so you can content yourself with skipping to the next section for now. However, if you really need to set up your element class on instantiation, there is one possible way to do so. ElementBase classes have an _init() method that can be overridden. It can be used to modify the XML tree, e.g. to construct special children or verify and update attributes.

The semantics of _init() are as follows:

  • It is called once on Element class instantiation time. That is, when a Python representation of the element is created by lxml. At that time, the element object is completely initialized to represent a specific XML element within the tree.

  • The method has complete access to the XML tree. Modifications can be done in exactly the same way as anywhere else in the program.

  • Python representations of elements may be created multiple times during the lifetime of an XML element in the underlying C tree. The _init() code provided by subclasses must take special care by itself that multiple executions either are harmless or that they are prevented by some kind of flag in the XML tree. The latter can be achieved by modifying an attribute value or by removing or adding a specific child node and then verifying this before running through the init process.

  • Any exceptions raised in _init() will be propagated through the API call that led to the creation of the Element. So be careful with the code you write here as its exceptions may turn up in various unexpected places.
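Applied to the code in the question, that would mean overriding _init() instead of __init__(). The following is a minimal sketch rather than a definitive fix. The "initialized" attribute and the init_calls list are hypothetical names introduced here for illustration, and the XML is parsed from a string instead of a file so the example is self-contained.

```python
from lxml import etree, objectify

init_calls = []  # hypothetical log, only here to make the _init() call visible


class CustomElement(etree.ElementBase):
    def _init(self):
        # At this point the proxy already represents its XML element,
        # so self.tag and the attributes can be read and modified.
        # The "initialized" attribute is an assumed flag that keeps the
        # setup from running again if lxml recreates the proxy later.
        if self.get("initialized") is None:
            self.set("initialized", "yes")
            init_calls.append(self.tag)
            print("Made CustomElement")


class CustomLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        lookupmap = {"custom": CustomElement}
        # Returning None hands the decision to the fallback lookup.
        return lookupmap.get(name)


parser = objectify.makeparser()
parser.set_element_class_lookup(CustomLookup())
root = objectify.fromstring("<custom/>", parser)
```

Because the flag lives in the XML tree and not on the Python object, it survives when the proxy is garbage collected and created again, which is the approach the third bullet above suggests.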
