I am using SUSE 10 Linux on a machine with 16 GB of RAM and two quad-core CPUs. There are 8 processes doing some work (CPU-intensive/network I/O), 4 of which have a memory leak (these are test conditions, so the leaks are not a problem here). The total memory occupied by all processes is around 15.4 GB; only 200 MB is free in the system. Things are fine for some hours, but after that malloc hangs (in a process which doesn't have a memory leak). It is stuck for more than 4 minutes (note that the CPU is not at 100%, but I/O has gone up significantly). There is no problem in the hung process itself (it has not corrupted memory). What is malloc doing? Is it trying to defragment, or building up swap space?
Any pointers?
If malloc() simply takes a long time, you're probably traversing a fragmented free list, many of whose entries have been swapped out. That is consistent with low CPU, high I/O, and limited free RAM.
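One way to test the "swapped-out pages" theory is to wrap a suspect allocation and count the major page faults (faults that needed disk I/O) it causes. The following is only a minimal sketch, assuming Linux and the standard getrusage() counters; probe_malloc and struct alloc_probe are hypothetical names made up for illustration, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Result of one timed allocation (hypothetical helper type). */
struct alloc_probe {
    double seconds;      /* wall-clock time for malloc plus first touch */
    long   major_faults; /* page faults that required disk I/O */
    long   minor_faults; /* page faults served without disk I/O */
    int    ok;           /* 1 if malloc returned non-NULL */
};

static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Allocate `size` bytes, touch every page, and report time and faults. */
struct alloc_probe probe_malloc(size_t size)
{
    struct alloc_probe r = {0.0, 0, 0, 0};
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    getrusage(RUSAGE_SELF, &before);
    double t0 = now_seconds();

    volatile char *p = malloc(size);
    if (p != NULL) {
        r.ok = 1;
        /* Touch one byte per page so the kernel must back the memory. */
        for (size_t i = 0; i < size; i += (size_t)page)
            p[i] = 1;
    }

    double t1 = now_seconds();
    getrusage(RUSAGE_SELF, &after);

    r.seconds      = t1 - t0;
    r.major_faults = after.ru_majflt - before.ru_majflt;
    r.minor_faults = after.ru_minflt - before.ru_minflt;
    free((void *)p);
    return r;
}

#ifdef PROBE_DEMO_MAIN
/* Build with -DPROBE_DEMO_MAIN to get a small command-line demo. */
int main(void)
{
    struct alloc_probe r = probe_malloc((size_t)64 << 20); /* 64 MB */
    printf("ok=%d time=%.3fs major=%ld minor=%ld\n",
           r.ok, r.seconds, r.major_faults, r.minor_faults);
    return 0;
}
#endif
```

If a slow call shows a large major_faults count together with little CPU time, that points at paging rather than at the allocator's own bookkeeping. The counters cover the whole process, so in a multithreaded program other threads' faults are included as well.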
For more information on malloc() implementations (including understanding fragmented free lists), the Wikipedia article is good: http://en.wikipedia.org/wiki/Malloc#Implementations
Oh, and memory leaks aren't acceptable, even in a test environment. As you can see, they're interfering with programs that (as far as you know) don't have leaks, and costing you time.
It might be annoying, but I would recommend using Valgrind on the process that blocks. There might be errors you didn't detect before. At least, you might have an idea of what is happening. However, the few hours might become days :/
Before, your machine was just short on free RAM. Now your malloc goes beyond the 16 GB limit of your machine and the system starts swapping. But checking your application as hinted by PierreBdR is certainly a good idea.
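To see whether the system as a whole has actually crossed into swap, you can sample free RAM and swap usage from inside a program. This is a rough sketch, assuming Linux and its sysinfo(2) call; read_mem_snapshot and struct mem_snapshot are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/* System-wide memory figures in bytes (hypothetical helper type). */
struct mem_snapshot {
    unsigned long long total_ram;
    unsigned long long free_ram;
    unsigned long long total_swap;
    unsigned long long used_swap;
};

/* Fill `out` from sysinfo(2). Returns 0 on success, -1 on failure. */
int read_mem_snapshot(struct mem_snapshot *out)
{
    struct sysinfo si;
    assert(out != NULL);

    if (sysinfo(&si) != 0)
        return -1;

    /* The sizes are reported in multiples of mem_unit bytes. */
    unsigned long long unit = si.mem_unit ? si.mem_unit : 1;
    out->total_ram  = (unsigned long long)si.totalram  * unit;
    out->free_ram   = (unsigned long long)si.freeram   * unit;
    out->total_swap = (unsigned long long)si.totalswap * unit;
    out->used_swap  = (unsigned long long)(si.totalswap - si.freeswap) * unit;
    return 0;
}

#ifdef SNAPSHOT_DEMO_MAIN
/* Build with -DSNAPSHOT_DEMO_MAIN to get a small command-line demo. */
int main(void)
{
    struct mem_snapshot s;
    if (read_mem_snapshot(&s) != 0) {
        perror("sysinfo");
        return 1;
    }
    printf("free RAM: %llu MB, swap in use: %llu MB\n",
           s.free_ram >> 20, s.used_swap >> 20);
    return 0;
}
#endif
```

Logging these numbers periodically while the test runs would show swap usage climbing as the leaking processes grow, and whether the malloc stall coincides with that point. Note that free_ram here does not count page cache, so it can look lower than the memory that is really available.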