My Python program has a curious performance behavior: the longer it runs, the slower it gets. Early on, it cranks out tens of work units per minute. After an hour or so it is taking tens of minutes per work unit. My suspicion is that this is caused by a congested garbage collector.
The catch is that my script is too memory-hungry for cProfile to work on large runs (see: cProfile taking a lot of memory).
We have written our own performance plugin, and we can observe most parts of our system; none of them seem to be the problem. The one stone still unturned is the GC.
Is there some other way (besides profile or cProfile) to see how much time is going to the GC?
In Python, most garbage is collected using reference counting. One would expect this to be quick and painless, and it seems unlikely that this is what you're after. I assume you're asking about the collector exposed by the gc module, which is used only for breaking reference cycles.
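To make the distinction concrete, here is a minimal sketch showing why the cyclic collector exists at all: objects caught in a reference cycle keep each other's refcounts above zero, so only gc can reclaim them. The Node class is illustrative, not from any library.

```python
import gc

# Two objects in a reference cycle are never freed by refcounting
# alone; only the cyclic collector in the gc module reclaims them.
class Node:
    def __init__(self):
        self.ref = None

a, b = Node(), Node()
a.ref, b.ref = b, a   # create a cycle: a -> b -> a
del a, b              # refcounts stay positive because of the cycle

# gc.collect() returns the number of unreachable objects it found;
# the cycle (two Nodes plus their attribute dicts) shows up here.
collected = gc.collect()
print(collected)
```

If your workload creates many such cycles, the collector has real work to do, and its cost can grow with the number of tracked objects.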
There are a few things that might be of use: http://docs.python.org/library/gc.html
Although there doesn't appear to be a direct method to time the garbage collector, you can turn it on and off, enable debugging, look at collection counts, and so on. All of this might be helpful in your quest.
For example, on my system gc
prints out the elapsed time if you turn on the debug flags:
In [1]: import gc
In [2]: gc.set_debug(gc.DEBUG_STATS)
In [3]: gc.collect()
gc: collecting generation 2...
gc: objects in each generation: 159 2655 7538
gc: done, 10 unreachable, 0 uncollectable, 0.0020s elapsed.
All of this aside, the first thing I would look at is the evolution of your program's memory usage as it runs. One possibility is that it is simply reaching the limit of available physical RAM and slowing down due to excessive page faults, rather than due to anything involving the garbage collector.
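A cheap way to check that hypothesis on Unix-like systems is the standard resource module, which reports the process's peak resident set size. The helper name is mine; note the platform quirk that ru_maxrss is in kilobytes on Linux but in bytes on macOS.

```python
import resource
import sys

# Report peak resident set size in MiB; log this periodically to see
# whether memory grows in step with the observed slowdown.
def peak_rss_mb():
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is bytes on macOS, kilobytes on Linux.
    scale = 1024 * 1024 if sys.platform == "darwin" else 1024
    return rss / scale

print(f"peak RSS so far: {peak_rss_mb():.1f} MiB")
```

If peak RSS climbs toward physical RAM as throughput drops, paging is the likelier culprit than the GC.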