Is a Python closure a good replacement for `__all__`?

Is it a good idea to use a closure instead of __all__ to limit the names exposed by a Python module? This would prevent programmers from accidentally reaching another module through the wrong name (import urllib; urllib.os.getlogin()), and it would avoid "from x import *" namespace pollution just as __all__ does.

def _init_module():
    global foo
    import bar
    def foo():
        return bar.baz.operation()
    class Quux(bar.baz.Splort): pass
_init_module(); del _init_module

vs. the same module using __all__:

__all__ = ['foo']
import bar
def foo():
    return bar.baz.operation()
class Quux(bar.baz.Splort): pass

Functions could just adopt this style to avoid polluting the module namespace:

def foo():
    import bar
    return bar.baz.operation()

This might be helpful for a large package that wants to help users distinguish its API from the package's own use of its and other modules' APIs during interactive introspection. On the other hand, maybe IPython should simply distinguish names in __all__ during tab completion, and more users should use an IDE that allows them to jump between files to see the definition of each name.

I am a fan of writing code that is absolutely as brain-dead simple as it can be.

__all__ is a feature of Python, added explicitly to solve the problem of limiting what names are made visible by a module. When you use it, people immediately understand what you are doing with it.
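For anyone unsure what __all__ does and does not cover, here is a minimal sketch. The module name demo_all and the helper load_module are made up for illustration, and os stands in for the question's bar so that the snippet runs on its own. It builds a throwaway module from source text, star-imports it, and then checks which names are still reachable as attributes.

```python
import sys
import types

# Hypothetical module source mirroring the question's __all__ version.
# `os` replaces the question's `bar` purely so the sketch is runnable.
SOURCE = '''
__all__ = ['foo']
import os
def foo():
    return os.sep
class Quux(object): pass
'''

def load_module(name, source):
    """Build a module object from source text and register it for import."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

demo = load_module("demo_all", SOURCE)

# A star-import honours __all__: only `foo` is copied into the namespace.
namespace = {}
exec("from demo_all import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))
print(star_names)  # ['foo']

# Attribute access and dir() ignore __all__ entirely.
print(hasattr(demo, "os"), hasattr(demo, "Quux"))  # True True
```

So __all__ addresses the "from x import *" half of the question, but not the urllib.os style of access.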

Your closure trick is very nonstandard, and if I encountered it, I would not immediately understand it. You would need to put in a long comment to explain it, and then you would need to put in another long comment to explain why you did it that way instead of using __all__.

EDIT: Now that I understand the problem a little better, here is an alternate answer.

In Python it is considered good practice to prefix a module's private names with an underscore. If you do from the_module_name import * and the module defines no __all__, you will get all the names that do not start with an underscore. So, rather than the closure trick, I would prefer to see correct use of the initial-underscore idiom.

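Note that if you use initial-underscore names, you don't even need to use __all__. The one catch is that modules pulled in with a plain import (such as bar) are public names too, so they are still exported by a star import unless you alias them with an underscore, e.g. import bar as _bar.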
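A rough sketch of the underscore idiom, under the same kind of assumptions as any toy example: the module name demo_underscore is invented, and os stands in for whatever the module really imports. No __all__ is defined, and the star-import still picks up only foo.

```python
import sys
import types

# Hypothetical module source; no __all__ is defined on purpose.
SOURCE = '''
import os as _os          # underscore alias keeps the import private

def _bar():
    return 5

def foo():
    return _bar() - 2
'''

module = types.ModuleType("demo_underscore")
exec(SOURCE, module.__dict__)
sys.modules["demo_underscore"] = module

# Star-import into a scratch namespace and list what arrived.
namespace = {}
exec("from demo_underscore import *", namespace)
public = sorted(n for n in namespace if not n.startswith("_"))
print(public)  # ['foo']

# The private names are hidden from the star-import only;
# they remain ordinary attributes of the module.
print(module._bar())  # 5
```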

The problem with from x import * is that it can hide NameErrors, which makes trivial bugs hard to track down. "Namespace pollution" means adding names to the namespace without knowing where they came from.

Which is kind of what your closure does too. Plus it might confuse IDEs, outlines, pylint and the like.

Using the "wrong" name for a module is not a real problem either. Module objects are the same from wherever you import them. If the "wrong" name disappears (after an update), the reason should be obvious, and it should motivate the programmer to do it properly next time. But it doesn't cause bugs.
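To illustrate the "same object" point, here is a small check using the standard library. It assumes that shutil imports os at module level, which is true in CPython but is an implementation detail rather than a documented API, hence the getattr fallback.

```python
import os
import shutil
import sys

# shutil happens to do `import os` internally in CPython. That is an
# implementation detail, so fall back to None if the name is ever absent.
leaked = getattr(shutil, "os", None)

# If the name is there, it is the very same module object as `os`.
same_object = leaked is os

# Every import of a module resolves to the single entry in sys.modules.
via_sys_modules = sys.modules["os"] is os

print(same_object, via_sys_modules)
```

Reaching os through shutil.os therefore behaves identically to a direct import; the only risk is that the accidental name goes away in a later release.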

Okay, I'm beginning to understand this issue a bit more. The closure really does allow for hiding private stuff. Here's a simple example.

Without the closure:

# module named "foo.py"
def _bar():
    return 5

def foo():
    return _bar() - 2

With the closure:

# module named "fooclosure.py"
def _init_module():
    global foo
    def _bar():
        return 5

    def foo():
        return _bar() - 2

_init_module(); del _init_module

Sample of usage:

>>> import foo
>>> dir(foo)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '_bar', 'foo']
>>>
>>> import fooclosure
>>> dir(fooclosure)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'foo']
>>>

This is actually disturbingly subtle. In the first case, function foo() simply has a reference to the name _bar(), and if you were to remove _bar() from the name space, foo() would stop working. foo() looks up _bar() each and every time it runs.

In contrast, the closure version of foo() works without _bar() existing in the name space. I'm not even certain how it works... is it holding a reference to the function object created for _bar(), or is it holding a reference to a name space that still exists, such that it can look up the name _bar() and find it?
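As far as CPython goes, the answer can be checked by introspection: the inner function holds a reference to a cell object, and that cell holds the function object created for _bar(). It is not a by-name lookup in a surviving namespace dictionary. The sketch below reuses the two versions from above in a single file; the names _plain_bar and plain_foo are renamed copies introduced only so both versions can coexist.

```python
# Closure version, as in fooclosure.py above.
def _init_module():
    global foo
    def _bar():
        return 5

    def foo():
        return _bar() - 2

_init_module(); del _init_module

free_names = foo.__code__.co_freevars   # names foo closes over
cell = foo.__closure__[0]               # the cell object bound to _bar
captured = cell.cell_contents           # the _bar function object itself

print(free_names)             # ('_bar',)
print(captured.__name__)      # _bar
print(foo())                  # 3

# Plain version, as in foo.py above (renamed for this sketch).
def _plain_bar():
    return 5

def plain_foo():
    return _plain_bar() - 2

del _plain_bar                # remove the helper from the namespace
try:
    plain_foo()
    plain_failed = False
except NameError:
    plain_failed = True       # the by-name lookup at call time fails

print(plain_failed)           # True
```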