How to properly subclass dict and override __getitem__ & __setitem__

I am debugging some code and I want to find out when a particular dictionary is accessed. Well, it's actually a class that subclasses dict and implements a couple of extra features. Anyway, what I would like to do is subclass dict myself and override __getitem__ and __setitem__ to produce some debugging output. Right now, I have

class DictWatch(dict):
    def __init__(self, *args):
        dict.__init__(self, args)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        return val

    def __setitem__(self, key, val):
        log.info("SET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        dict.__setitem__(self, key, val)

'name_label' is a key which will eventually be set that I want to use to identify the output. I have then changed the class I am instrumenting to subclass DictWatch instead of dict and changed the call to the superconstructor. Still, nothing seems to be happening. I thought I was being clever, but I wonder if I should be going a different direction.

Thanks for the help!

Another issue when subclassing dict is that the built-in __init__ doesn't call update, and the built-in update doesn't call __setitem__. So, if you want all set operations to go through your __setitem__, you have to route them there yourself:

class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        print('GET', key)
        return val

    def __setitem__(self, key, val):
        print('SET', key, val)
        dict.__setitem__(self, key, val)

    def __repr__(self):
        dictrepr = dict.__repr__(self)
        return '%s(%s)' % (type(self).__name__, dictrepr)
        
    def update(self, *args, **kwargs):
        print('update', args, kwargs)
        for k, v in dict(*args, **kwargs).items():
            self[k] = v
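To check which operations actually reach the override, it can help to record calls instead of printing them. The sketch below uses a hypothetical RecordingDict (not part of the answer above) with the same __init__/update pattern. It suggests that other built-in mutators such as setdefault would still need their own override.

```python
class RecordingDict(dict):
    """Hypothetical variant of DictWatch that records written keys."""

    def __init__(self, *args, **kwargs):
        self.writes = []              # instance attribute, not a dict entry
        self.update(*args, **kwargs)

    def __setitem__(self, key, val):
        self.writes.append(key)
        dict.__setitem__(self, key, val)

    def update(self, *args, **kwargs):
        for k, v in dict(*args, **kwargs).items():
            self[k] = v


d = RecordingDict(a=1)    # routed through __setitem__ via update
d.update(b=2)             # routed through __setitem__
d["c"] = 3                # routed through __setitem__
d.setdefault("e", 5)      # built-in setdefault: stored, but not recorded
print(d.writes)           # ['a', 'b', 'c']
```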

What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.
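A minimal sketch of that logging setup, assuming the module-level logger is called log as in the question (the logger name "dictwatch" is made up for illustration):

```python
import logging

# Without any configuration, only WARNING and above are emitted,
# so log.info(...) calls produce no visible output.
logging.basicConfig(level=logging.DEBUG)

log = logging.getLogger("dictwatch")  # hypothetical logger name

# Arguments are passed separately; logging does the formatting lazily.
log.info("GET %s[%r] = %s", "some_label", "key", "val")
```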

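Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than through methods such as d.get(), d.setdefault() or d.update(), which do not go through the overrides on a dict subclass.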
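A small check of this behaviour, using a hypothetical Watched class and a module-level list purely for illustration:

```python
seen = []  # keys observed by the overridden __getitem__


class Watched(dict):  # hypothetical name, for illustration only
    def __getitem__(self, key):
        seen.append(key)
        return dict.__getitem__(self, key)


d = Watched(a=1)
d["a"]         # goes through Watched.__getitem__
d.get("a")     # built-in dict.get reads the value directly
print(seen)    # ['a']: only the [] access was recorded
```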

Consider subclassing UserDict or UserList (from the collections module). These classes are intended to be subclassed, whereas the built-in dict and list are not: they contain optimisations that can bypass overridden methods.

This should not change the result (logging should work, provided the logging threshold is set appropriately), but your __init__ should be:

def __init__(self, *args, **kwargs):
    dict.__init__(self, *args, **kwargs)

As written, DictWatch([(1,2),(2,3)]) silently builds the wrong dict ({(1, 2): (2, 3)}), and DictWatch(a=1,b=2) raises a TypeError because __init__ accepts no keyword arguments.

(Or, better, don't define a constructor at all.)
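Put together, a sketch of the question's class without a constructor might look like this. It assumes a module-level logger named log, as in the question (the logger name "dictwatch" is made up), and passes the values as logging arguments rather than pre-formatting them:

```python
import logging

log = logging.getLogger("dictwatch")  # hypothetical logger name


class DictWatch(dict):
    # No __init__: dict's own constructor handles every argument form.

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s[%r] = %s", dict.get(self, "name_label"), key, val)
        return val

    def __setitem__(self, key, val):
        log.info("SET %s[%r] = %s", dict.get(self, "name_label"), key, val)
        dict.__setitem__(self, key, val)


d = DictWatch([(1, 2), (2, 3)])   # list of pairs works
e = DictWatch(a=1, b=2)           # keyword arguments work
d[3] = 4                          # logged by __setitem__
```

Note that the initial contents passed to the constructor are still stored without calling __setitem__.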

As Andrew Pate's answer proposed, subclassing collections.UserDict instead of dict is much less error-prone.

Here is an example showing an issue when inheriting dict naively:

class MyDict(dict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Bad! MyDict.__setitem__ not called
d.update(c=3)  # Bad! MyDict.__setitem__ not called
d['d'] = 4  # Good!
print(d)  # {'a': 1, 'b': 2, 'c': 3, 'd': 40}

UserDict inherits from collections.abc.MutableMapping, so this works as expected:

import collections


class MyDict(collections.UserDict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Good: MyDict.__setitem__ correctly called
d.update(c=3)  # Good: MyDict.__setitem__ correctly called
d['d'] = 4  # Good
print(d)  # {'a': 10, 'b': 20, 'c': 30, 'd': 40}

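Similarly, you only have to implement __getitem__ for inherited methods such as my_dict.get or my_dict.values to pick it up automatically, … (One caveat: UserDict answers key in my_dict directly from its underlying data, without calling __getitem__.)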
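A short illustration with a hypothetical Upper class, which upper-cases values on the way out:

```python
import collections


class Upper(collections.UserDict):  # hypothetical example class
    def __getitem__(self, key):
        return super().__getitem__(key).upper()


d = Upper(a="x")
print(d["a"])               # X
print(d.get("a"))           # X: get() goes through __getitem__
print(list(d.values()))     # ['X']: so does iteration over values
print(d.get("zz", "n/a"))   # n/a: missing keys still fall back to the default
```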

Note: UserDict is not a subclass of dict, so isinstance(UserDict(), dict) returns False (but isinstance(UserDict(), collections.abc.MutableMapping) returns True).
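This is easy to check. The underlying plain dict is available as the data attribute, which may help when some API insists on a real dict:

```python
import collections
import collections.abc

d = collections.UserDict(a=1)

print(isinstance(d, dict))                             # False
print(isinstance(d, collections.abc.MutableMapping))   # True
print(isinstance(d.data, dict))                        # True: plain dict held in .data
```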

All you will have to do is

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

A sample from my own use:

### EXAMPLE
import collections.abc


class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

    def __setitem__(self, key, item):
        # Accept only (database_name, table_name) keys with iterable values.
        # Iterable lives in collections.abc (the collections.Iterable alias
        # was removed in Python 3.10).
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            super(BatchCollection, self).__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")

Note: tested only in Python 3.
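Since this class overrides only __setitem__, the earlier point about dict's constructor and update not calling the override applies here too. A quick check with a condensed copy of the class (shortened error message, collections.abc.Iterable assumed for current Python versions):

```python
import collections.abc


class BatchCollection(dict):
    def __setitem__(self, key, item):
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            super().__setitem__(key, item)
        else:
            raise Exception("invalid key or value")


b = BatchCollection()
b[("db", "table")] = [1, 2]       # accepted

try:
    b["bad"] = 1                  # rejected by __setitem__
except Exception as exc:
    print("rejected:", exc)

# The constructor and update() do not go through __setitem__,
# so invalid entries slip in unchecked.
c = BatchCollection({"bad": 1})
c.update(worse=2)
print(c)                          # {'bad': 1, 'worse': 2}
```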