Are Python docstrings and comments stored in memory when a module is loaded?

开发者 https://www.devze.com 2022-12-15 07:20 Source: Internet

I've wondered if this is true, because I usually document my code well; could this affect memory usage?

Usually every Python object has a __doc__ attribute. Are those docstrings read from the file, or processed otherwise?

I've searched the forums here, Google, and mailing lists, but I haven't found any relevant information.

Do you know better?


By default, docstrings are present in the .pyc bytecode file, and are loaded from it (comments are not). If you use python -OO (the -OO flag stands for "optimize intensely", as opposed to -O, which stands for "optimize mildly"), you get and use .pyo files instead of .pyc files, and those are optimized by omitting the docstrings (in addition to the optimizations done by -O, which removes assert statements). E.g., consider a file foo.py that has:

"""This is the documentation for my module foo."""

def bar(x):
  """This is the documentation for my function foo.bar."""
  return x + 1

you could have the following shell session:

$ python -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyc
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyc
$ python -O -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyo
$ python -OO -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyo
$ rm foo.pyo
$ python -OO -c'import foo; print foo.bar(22); print foo.__doc__'
23
None
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  204 Dec 30 16:17 foo.pyo

Note that, since we used -O first, the .pyo file was 327 bytes -- even after using -OO, because the .pyo file was still around and Python didn't rebuild/overwrite it, it just used the existing one. Removing the existing .pyo (or, equivalently, touch foo.py so that Python knows the .pyo is "out of date") means that Python rebuilds it (and, in this case, saves 123 bytes on disk, and a little bit more when the module's imported -- but all .__doc__ entries disappear and are replaced by None).
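
The same effect can be reproduced without touching any cached files. The shell session above is from Python 2; Python 3.5 and later write .opt-1.pyc and .opt-2.pyc files under __pycache__ instead of .pyo files (PEP 488). The following is a minimal sketch, assuming Python 3, where the built-in compile() accepts an optimize argument whose values 0, 1 and 2 correspond to no flag, -O and -OO. The helper name load_docs is made up for this illustration.

```python
# Same content as foo.py from the answer above.
SOURCE = '''"""This is the documentation for my module foo."""

def bar(x):
    """This is the documentation for my function foo.bar."""
    return x + 1
'''


def load_docs(optimize):
    """Compile SOURCE at the given level; return (module doc, function doc)."""
    namespace = {}
    code = compile(SOURCE, "foo.py", "exec", optimize=optimize)
    exec(code, namespace)
    # With optimize=2 the module never assigns __doc__, so .get() yields None.
    return namespace.get("__doc__"), namespace["bar"].__doc__


for level in (0, 1, 2):
    module_doc, func_doc = load_docs(level)
    print(level, repr(module_doc), repr(func_doc))
```

Levels 0 and 1 print both docstrings; level 2 prints None for both, matching what the -OO run shows once the stale .pyo has been removed.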


Yes, the docstrings are read from the file, but that shouldn't stop you from writing them. Never compromise the readability of code for performance until you have run a performance test and found that the thing you are worried about is in fact the bottleneck in your program. I think it is extremely unlikely that a docstring will cause any measurable performance impact in any real-world situation.


They are read from the file (when the file is compiled to .pyc or when the .pyc is loaded; they must be available under object.__doc__), but no, this will not significantly impact performance under any reasonable circumstances. Or are you really writing multi-megabyte docstrings?
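
To check the actual cost rather than guess, one option is to total the size of the docstrings a loaded module carries. This is only a rough sketch: the function name docstring_bytes is hypothetical, sys.getsizeof reports the size of each string object as CPython sees it, and only the module itself plus the functions, classes and methods defined in it are counted.

```python
import inspect
import json
import sys


def docstring_bytes(module):
    """Roughly total the in-memory size of docstrings defined in `module`."""
    seen = set()   # ids of strings already counted, to avoid double counting
    total = 0

    def add(obj):
        nonlocal total
        doc = getattr(obj, "__doc__", None)
        if isinstance(doc, str) and id(doc) not in seen:
            seen.add(id(doc))
            total += sys.getsizeof(doc)

    add(module)
    for _, member in inspect.getmembers(module):
        # Skip anything imported from elsewhere.
        if getattr(member, "__module__", None) != module.__name__:
            continue
        if inspect.isfunction(member):
            add(member)
        elif inspect.isclass(member):
            add(member)
            for _, method in inspect.getmembers(member, inspect.isfunction):
                add(method)
    return total


# Example: the standard-library json package.
print(docstring_bytes(json), "bytes of docstrings in json")
```

For typical modules the total is a few kilobytes, which is the scale the answers above have in mind when they call the impact negligible.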


Are Python docstrings and comments stored in memory when a module is loaded?

Docstrings are compiled into the .pyc file, and are loaded into memory. Comments are discarded during compilation and have no impact on anything except the insignificant extra time taken to ignore them during compilation (which happens once only after any change to a .py file, except for the main script which is re-compiled every time it is run).

Also note that these strings are preserved only if they are the first thing in the module, class definition, or function definition. You can include additional strings pretty much anywhere, but they will be discarded during compilation just as comments are.
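
The point about additional strings can be checked directly. A small sketch, assuming CPython 3 (dropping bare string statements is a compiler behaviour, not something the language guarantees in these terms); the marker strings are arbitrary placeholders chosen so they are easy to search for in the serialized code object.

```python
import marshal

SOURCE = '''
def fn():
    """DOCSTRING_KEPT"""
    x = 1
    """EXTRA_STRING_DROPPED"""
    # COMMENT_DROPPED
    return x
'''

# optimize=0 keeps docstrings even if the interpreter was started with -OO.
code = compile(SOURCE, "<string>", "exec", optimize=0)
serialized = marshal.dumps(code)

for marker in (b"DOCSTRING_KEPT", b"EXTRA_STRING_DROPPED", b"COMMENT_DROPPED"):
    # True only for the string in docstring position.
    print(marker.decode(), marker in serialized)
```

Only the first marker is found in the compiled code; the later bare string and the comment leave no trace.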


As other answers mentioned, comments are discarded during compilation, but docstrings are stored in the .pyc file and are loaded into memory.

A .pyc file contains code objects serialized with marshal. The format is not meant to be human-readable, but string constants are still recognizable in it. So why not check that the docstring really is in the .pyc data?

import marshal

text = '''def fn():
    """ZZZZZZZZZZZZZZZZZZ"""
    # GGGGGGGGGGGGGGGGGGG'''

code_object = compile(text, '<string>', 'exec')
serialized = marshal.dumps(code_object)
print(serialized)
print(b"ZZZZZZZZZZZZZZZZZZ" in serialized)
print(b"GGGGGGGGGGGGGGGGGGG" in serialized)

output:

b'\xe3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00@\x00\x00\x00s\x0c\x00\x00\x00d\x00d\x01\x84\x00Z\x00d\x02S\x00)\x03c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00C\x00\x00\x00s\x04\x00\x00\x00d\x01S\x00)\x02Z\x12ZZZZZZZZZZZZZZZZZZN\xa9\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00\xfa\x08<string>\xda\x02fn\x01\x00\x00\x00s\x02\x00\x00\x00\x00\x01r\x03\x00\x00\x00N)\x01r\x03\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x02\x00\x00\x00\xda\x08<module>\x01\x00\x00\x00\xf3\x00\x00\x00\x00'
True
False

Where is it referenced in the function's code object? In .co_consts:

new_code_object = marshal.loads(serialized)
print(new_code_object.co_consts[0].co_consts[0])

output:

ZZZZZZZZZZZZZZZZZZ

def fn():
    """ZZZZZZZZZZZZZZZZZZ"""
    # GGGGGGGGGGGGGGGGGGG

print(fn.__code__.co_consts[0] is fn.__doc__) # True