Python class syntax - is this a good idea?

I'm tempted to define my Python classes like this:

class MyClass(object):
    """my docstring"""

    msg = None
    a_variable = None
    some_dict = {}

    def __init__(self, msg):
        self.msg = msg

Is declaring the object variables (msg, a_variable, etc.) at the top, as in Java, good, bad, or indifferent? I know it's unnecessary, but it's still tempting to do.

Defining variables in the class definition like that makes them shared by every instance of that class. In Java terms, it is a bit like making the variable static. However, there are major differences, as shown below.

class MyClass(object):
    msg = "ABC"

print(MyClass.msg)      # prints ABC
a = MyClass()
print(a.msg)            # prints ABC (read falls back to the class attribute)
a.msg = "abc"           # creates an attribute on the instance only
print(a.msg)            # prints abc
print(MyClass.msg)      # prints ABC
print(a.__class__.msg)  # prints ABC

As the code above shows, it is not quite the same thing: the variable can be read through an instance (a.msg, or self.msg inside a method), but assigning to it that way does not change the variable defined at class scope; it creates a separate attribute on the instance.

One of the disadvantages of doing it the way you propose is that it can lead to errors, because it adds hidden state to the class. Say someone left out self.msg = "ABC" from the constructor (or, more realistically, the code was refactored and only one of the definitions was altered):

a = MyClass()
print(a.msg)   # prints ABC

# somewhere else in the program
MyClass.msg = "XYZ"

# now the same bit of code leads to a different result, despite the
# expectation that it leads to the same result.
a = MyClass()
print(a.msg)   # prints XYZ

It is far better to avoid defining msg at the class level; then a missing assignment fails loudly instead of silently falling back to the class value:

class MyClass(object):
    pass

print(MyClass.msg)  # AttributeError: type object 'MyClass' has no attribute 'msg'

Declaring variables directly inside the class definition makes them class variables instead of instance variables. Class variables are somewhat similar to static variables in Java and should be used like MyClass.a_variable. But they can also be used like self.a_variable, which is a problem because naive programmers can treat them as instance variables. Your "some_dict" variable, for example, would be shared by each instance of MyClass, so if you add a key "k" to it, that will be visible to any instance.
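To make the some_dict point concrete, here is a minimal sketch. The class names Shared and PerInstance are made up for illustration; only some_dict comes from the question.

```python
class Shared(object):
    # Class-level dict: a single object reachable from every instance.
    some_dict = {}


class PerInstance(object):
    def __init__(self):
        # Created in __init__: each instance gets its own dict.
        self.some_dict = {}


a, b = Shared(), Shared()
a.some_dict["k"] = 1               # mutates the one class-level dict
print(b.some_dict)                 # {'k': 1}
print(a.some_dict is b.some_dict)  # True

c, d = PerInstance(), PerInstance()
c.some_dict["k"] = 1               # only touches c's own dict
print(d.some_dict)                 # {}
```

Note that a.some_dict["k"] = 1 never assigns to the attribute itself; it only mutates the object the class attribute refers to, which is why every instance sees the change.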

If you always remember to re-assign class variables on each instance (for example in __init__), they behave almost exactly like instance variables; only the initial definition remains on MyClass. But that's still not good practice, as you will run into trouble whenever you forget to re-assign one of them!

Better write the class like so:

class MyClass(object):
    """
    Some class
    """

    def __init__(self, msg):
        self.__msg = msg
        self.__a_variable = None
        self.__some_dict = {}

Using two underscores for "private" variables (pseudo-private!) is optional. If the variables should be public, just keep their names without the __ prefix.
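In case "pseudo-private" is unclear: Python rewrites a double-underscore attribute name used inside a class body by prefixing it with the class name (name mangling). A rough sketch, where get_msg is a hypothetical accessor added only for the demonstration:

```python
class MyClass(object):
    def __init__(self, msg):
        self.__msg = msg          # actually stored as _MyClass__msg

    def get_msg(self):
        # Hypothetical accessor; inside the class the short name works.
        return self.__msg


m = MyClass("hello")
print(m.get_msg())                # hello
print(sorted(vars(m)))            # ['_MyClass__msg']
print(m._MyClass__msg)            # hello (still reachable, hence "pseudo")
print(hasattr(m, "__msg"))        # False
```

So the prefix mostly guards against accidental name clashes in subclasses; it does not actually hide anything from a determined caller.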

Careful. The two msg attributes are actually stored in two different dictionaries. One shadows the other, but the shadowed class-level msg is still stored in the class's dictionary, so it goes unused and yet still takes up some memory.

class MyClass(object):
    msg = 'FeeFiFoFum'

    def __init__(self, msg):
        self.msg = msg

m = MyClass('Hi Lucy')

Notice that the instance's dict has 'Hi Lucy' as the value.

print(m.__dict__)
# {'msg': 'Hi Lucy'}

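Notice that MyClass's dict (accessed through m.__class__) still has FeeFiFoFum. (The output below appears to come from the class in the question with msg set to 'FeeFiFoFum', which would explain why some_dict, a_variable, and the docstring show up as well.)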

print(m.__class__.__dict__)
# {'__dict__': <attribute '__dict__' of 'MyClass' objects>, '__module__': '__main__', '__init__': <function __init__ at 0xb76ea1ec>, 'msg': 'FeeFiFoFum', 'some_dict': {}, '__weakref__': <attribute '__weakref__' of 'MyClass' objects>, '__doc__': 'my docstring', 'a_variable': None}

Another (perhaps simpler) way to see this:

print(m.msg)
# Hi Lucy
print(MyClass.msg)
# FeeFiFoFum

When you declare a class, Python executes its body and puts every name it defines into the namespace of the class; the class then serves as a kind of template for all objects created from it. An instance does not copy those names: looking up an attribute on an instance falls back to the class when the instance has no entry of its own.
Note that you always hold a reference; as such, if you alter the referenced object in place, the change is visible everywhere it is used. However, assigning through an instance creates an entry that is unique to that instance, so binding the name to a new object is not reflected anywhere else.
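The lookup order described above can be observed with vars(), which returns the instance's own namespace. This is only a small sketch; the string values are placeholders.

```python
class MyClass(object):
    msg = "class default"


m = MyClass()
print(vars(m))             # {} -> nothing is stored on the instance yet
print(m.msg)               # class default (found on the class)

m.msg = "instance value"   # creates an entry in the instance's namespace
print(vars(m))             # {'msg': 'instance value'}
print(MyClass.msg)         # class default (unchanged)

del m.msg                  # removes only the instance entry
print(m.msg)               # class default (lookup falls back again)
```

Deleting the instance attribute makes the class attribute visible again, which shows it was shadowed the whole time rather than overwritten.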

Note: Michael Foord has a very nice blog entry on how class instantiation works; if you are interested in this topic, I recommend that short read.

Anyway, for all practical purposes, there are two main differences between your two approaches:

The name is already available at class level, and you can use it without instantiating a new object; this may sound neat for declaring constants in namespaces, but in many cases the module itself may already be a good enough namespace.
The name is added at class level, which means that you may not be able to mock it easily during unit tests, and that any expensive operation used to compute its value runs at the very moment of the import.
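The import-time cost mentioned in the second point can be sketched as follows. Everything here (expensive_default, EagerConfig, LazyConfig, and the calls list) is hypothetical and only stands in for some slow computation.

```python
calls = []


def expensive_default():
    # Hypothetical stand-in for a slow computation (file read, query, ...).
    calls.append("ran")
    return {"loaded": True}


class EagerConfig(object):
    # Evaluated once, while the class statement itself executes; for a
    # module-level class that means at import time.
    settings = expensive_default()


class LazyConfig(object):
    def __init__(self):
        # Evaluated only when an instance is actually created.
        self.settings = expensive_default()


print(len(calls))   # 1 -> EagerConfig already paid the cost, no instance yet
LazyConfig()
print(len(calls))   # 2
```

With the class-level version the work happens even if nobody ever instantiates the class, which is the cost the answer above is warning about.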

When reviewing code, I usually look at members declared at class level with a bit of suspicion; there are plenty of good use cases for them, but it is also quite likely that they are there out of habit from other programming languages.