Is there a way to profile the memory of a multithreaded program in Python?
For CPU profiling, I am using cProfile to create separate profiler stats for each thread and later combine them. However, I couldn't find a way to do this with memory profilers. I am using heapy.
Is there a way to combine stats in heapy as with cProfile? Or what other memory profilers would you suggest that are more suitable for this task?
A related question was asked about profiling CPU usage in a multithreaded program: How can I profile a multithread program in Python?
There is also another question about memory profilers: Python memory profiler
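For reference, the "separate stats, combined later" approach mentioned for cProfile can be sketched roughly as follows. This is only an illustration: `work`, `profile_call` and `combine` are hypothetical names, and the two profiles are collected one after the other to keep the sketch short, whereas in a real program each `cProfile.Profile` object would be created and used inside its own thread.

```python
import cProfile
import io
import pstats


def work(n):
    # Hypothetical workload, used only for illustration.
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    # Collect the stats for one call in its own Profile object.
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    return profiler


def combine(profilers):
    # pstats.Stats accepts Profile objects; add() merges further ones into it.
    stream = io.StringIO()
    combined = pstats.Stats(profilers[0], stream=stream)
    for profiler in profilers[1:]:
        combined.add(profiler)
    return combined, stream


# Stand-ins for the per-thread profiles.
profilers = [profile_call(work, 1000), profile_call(work, 2000)]
combined, stream = combine(profilers)
combined.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

The merged report ends up in `report`; the open question is whether something equivalent exists for memory statistics.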
If you are happy to profile objects rather than raw memory, you can use the gc.get_objects() function, so you don't need a custom metaclass. In more recent Python versions, sys.getsizeof() will also let you take a shot at figuring out how much underlying memory is in use by those objects.
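A minimal sketch of that idea, assuming a per-type summary is enough. `object_summary` is a hypothetical helper name. Note that gc.get_objects() only returns objects tracked by the garbage collector (mostly containers and class instances), that sys.getsizeof() reports shallow sizes only, and that the result covers the whole process rather than a single thread.

```python
import gc
import sys
from collections import Counter


def object_summary(limit=10):
    """Return (type name, object count, total shallow size) tuples."""
    counts = Counter()
    sizes = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        # The second argument is returned if the object reports no size.
        sizes[name] += sys.getsizeof(obj, 0)
    # limit=None returns every type, most common first.
    return [(name, count, sizes[name]) for name, count in counts.most_common(limit)]


for name, count, size in object_summary():
    print("%-25s %8d objects %12d bytes" % (name, count, size))
```

Calling `object_summary()` at two points in the program and comparing the counts gives a rough picture of which types are accumulating.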
There are ways to get valgrind to profile memory of python programs: http://www.python.org/dev/faq/#can-i-run-valgrind-against-python
OK. What I was looking for does not seem to exist, so I found a workaround for this problem.
Instead of profiling memory, I'll profile objects. This way, I'll be able to see how many objects exist at a specific time in the program. In order to achieve my goal, I made use of metaclasses with minimal modification to already existing code.
The following metaclass adds a very simple subroutine to the __init__ and __del__ functions of the class. The subroutine for __init__ increases the number of objects with that class name by one, and the one for __del__ decreases it by one.
class ObjectProfilerMeta(type):
    # Python 2 syntax: set __metaclass__ = ObjectProfilerMeta in a class body
    # to profile the objects of that class.
    def __new__(cls, name, bases, attrs):
        if name.startswith('None'):
            return None
        # Wrap the existing __init__/__del__, or install a counting stub.
        if "__init__" in attrs:
            attrs["__init__"] = incAndCall(name, attrs["__init__"])
        else:
            attrs["__init__"] = incAndCall(name, dummyFunction)
        if "__del__" in attrs:
            attrs["__del__"] = decAndCall(name, attrs["__del__"])
        else:
            attrs["__del__"] = decAndCall(name, dummyFunction)
        return super(ObjectProfilerMeta, cls).__new__(cls, name, bases, attrs)

    def __init__(self, name, bases, attrs):
        super(ObjectProfilerMeta, self).__init__(name, bases, attrs)

    def __add__(self, other):
        class AutoClass(self, other):
            pass
        return AutoClass
The incAndCall and decAndCall functions use a global variable of the module they are defined in.
counter = {}

def incAndCall(name, func):
    if name not in counter:
        counter[name] = 0
    def f(*args, **kwargs):
        counter[name] += 1
        func(*args, **kwargs)
    return f

def decAndCall(name, func):
    if name not in counter:
        counter[name] = 0
    def f(*args, **kwargs):
        counter[name] -= 1
        func(*args, **kwargs)
    return f

def dummyFunction(*args, **kwargs):
    pass
The dummyFunction is just a very simple workaround. I am sure there are much better ways to do it.
Finally, whenever you want to see the number of objects that exist, you just need to look at the counter dictionary. An example:
>>> class A:
        __metaclass__ = ObjectProfilerMeta
        def __init__(self):
            pass

>>> class B:
        __metaclass__ = ObjectProfilerMeta

>>> l = []
>>> for i in range(117):
        l.append(A())

>>> for i in range(18):
        l.append(B())

>>> counter
{'A': 117, 'B': 18}
>>> l.pop(15)
<__main__.A object at 0x01210CB0>
>>> counter
{'A': 116, 'B': 18}
>>> l = []
>>> counter
{'A': 0, 'B': 0}
I hope this helps you. It was sufficient for my case.
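Since the original question is about multithreaded programs, note that `counter[name] += 1` is a read-modify-write and may lose updates when several threads create objects at once. Below is a rough Python 3 sketch of the same counting idea guarded by a lock. It is not the code above: it swaps in a metaclass `__call__` plus weakref.finalize instead of wrapping `__init__` and `__del__`, and the names `live_counts`, `_counts_lock`, `_decrement` and `CountingMeta` are hypothetical.

```python
import threading
import weakref
from collections import Counter

# Hypothetical names; the answer above uses a plain dict called `counter`.
live_counts = Counter()
# Re-entrant, in case a finalizer fires in a thread that already holds the lock.
_counts_lock = threading.RLock()


def _decrement(name):
    with _counts_lock:
        live_counts[name] -= 1


class CountingMeta(type):
    """Count live instances per class name (Python 3 metaclass syntax)."""

    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        with _counts_lock:
            live_counts[cls.__name__] += 1
        # Runs _decrement when the instance is garbage collected.
        weakref.finalize(instance, _decrement, cls.__name__)
        return instance


class A(metaclass=CountingMeta):
    pass


objects = [A() for _ in range(3)]
print(live_counts["A"])
```

Instances must support weak references for this to work, which is the case for ordinary classes that do not define `__slots__` without `__weakref__`.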
I've used Yappi, which I've had success with for a few special multi-threaded cases. It's got great documentation so you shouldn't have too much trouble setting it up.
For memory-specific profiling, check out Heapy. Be warned, it may create some of the largest log files you've ever seen!