I am running gprof on my executable, but the executable spends a lot of time wait()ing for child processes to complete. Is the time spent waiting factored into the gprof timings?
I haven't used gprof much, but to my knowledge neither the wait nor the child processes per se are profiled. Here is a simple example:
#include <stdlib.h>
#include <unistd.h>
#include <limits.h>
#include <sys/types.h>   /* pid_t */
#include <sys/wait.h>    /* waitpid */

void slow_function()
{
    unsigned int i;
    for (i = 0; i < UINT_MAX; i++); /* busy loop: the child burns CPU */
}

void quick_function(pid_t child)
{
    int status;
    waitpid(child, &status, 0);     /* blocks, but consumes no CPU */
}

int main(int argc, const char *argv[])
{
    pid_t child;
    child = fork();
    if (child == 0) // child process
    {
        slow_function();
        exit(0);
    }
    else
        quick_function(child);
    return 0;
}
The gprof output for this is (on my machine):

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 0.00      0.00      0.00        1     0.00     0.00  quick_function
If you actually want to profile the children/threads, I'd suggest this as a starting point. It seems there is an option to log forked processes; this IBM article talks about it a bit.
The same article recommends trying tprof, which is similar to gprof in use but works differently under the hood, and may give a more accurate picture for multi-process/multi-threaded applications.
gprof only counts actual CPU time in your process. What works a lot better is something that samples the call stack, and samples it on wall-clock time, not CPU time. Of course, samples should not be taken while waiting for user input (or if they are taken, they should be discarded). Some profilers can do all this, such as RotateRight/Zoom, or you can use pstack or lsstack, but here's a simple way to do it.