microsecond profiler for C code

Does anybody know of a C code profiler, like gprof, which gives function call times in microseconds instead of milliseconds?


Take a look at Linux perf. You will need a pretty recent kernel though.


Let me just suggest how I would handle this, assuming you have the source code.

The average inclusive time a function takes per invocation (including I/O), multiplied by the number of invocations and divided by the total running time, gives you the fraction of time spent under the control of that function. That fraction is how you know whether the function is a big enough time-taker to be worth optimizing. That information is not easy to get from gprof.
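As a minimal sketch of that arithmetic (the measurements below are made-up numbers, purely for illustration):

    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical measurements for one function, for illustration only. */
        double avg_inclusive_us = 85.0;    /* average inclusive time per call, in microseconds */
        long   invocations      = 120000;  /* number of times the function was called */
        double total_run_us     = 30.0e6;  /* total running time of the program: 30 seconds */

        /* Fraction of total time spent under the control of this function. */
        double fraction = (avg_inclusive_us * invocations) / total_run_us;

        printf("fraction of run time: %.1f%%\n", fraction * 100.0);  /* prints 34.0% */
        return 0;
    }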

Another way to learn what fraction of inclusive time is spent under the control of each function is timed or random sampling of the call stack. If a function appears on a fraction X of the samples (even if it appears more than once in a sample), then X is the time-fraction it takes (within a margin of error). What's more, this gives you per-line fraction of time, not just per-function.

That fraction X is the most valuable information you can get, because that is the total amount of time you could potentially save by optimizing that function or line of code.

The Zoom profiler is a good tool for getting this information.

What I would do is wrap a long-running loop around the top-level code, so that it executes repeatedly, long enough to take at least several seconds. Then I would manually sample the stack by interrupting or pausing it at random. It actually takes very few samples, like 10 or 20, to get a really clear picture of the most time-consuming functions and/or lines of code.

Here's an example.
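Here is a minimal sketch of the loop-wrapping idea; do_work() below is just a placeholder for whatever your program's real top-level routine is:

    #include <math.h>
    #include <stdio.h>

    /* Placeholder for your program's real top-level routine. */
    static double do_work(void)
    {
        double s = 0.0;
        for (long i = 1; i < 2000000; i++)
            s += sin((double)i) / i;   /* arbitrary busy work */
        return s;
    }

    int main(void)
    {
        double total = 0.0;

        /* Run the top-level code repeatedly so the whole run lasts at least
           several seconds, then pause it at random moments (e.g. Ctrl-C under
           gdb followed by "bt") and note what is on the stack each time. */
        for (int i = 0; i < 100; i++)
            total += do_work();

        printf("%f\n", total);   /* print the result so the work is not optimized away */
        return 0;
    }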

P.S. If you're worried about statistical accuracy, let me get quantitative. If a function or line of code is on the stack exactly 50% of the time, and you take 10 samples, then the number of samples that show it will be 5 +/- 1.6, for a margin of error of 16%. If the actual time is smaller or larger, the margin of error shrinks. You can also reduce the margin of error by taking more samples. To get 1.6%, take 1000 samples. Actually, once you've found the problem, it's up to you to decide if you need a smaller margin of error.
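For reference, those numbers are just the standard deviation of a binomial count, assuming each of the n samples shows the function independently with probability p:

    \sigma = \sqrt{n\,p\,(1-p)}
    n = 10,\; p = 0.5:\quad \sigma = \sqrt{2.5} \approx 1.6 \text{ samples} \;(16\% \text{ of } n)
    n = 1000,\; p = 0.5:\quad \sigma \approx 15.8 \text{ samples} \;(\approx 1.6\% \text{ of } n)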


gprof gives results either in milliseconds or in microseconds. I do not know the exact rationale, but in my experience it displays microseconds when it decides there is enough precision for them. To get microsecond output, run the program for a longer time and/or avoid having any routine whose run time is too large.


oprofile gives you times at clock resolution, i.e. nanoseconds. It produces output files compatible with gprof, so it is very convenient to use.

http://oprofile.sourceforge.net/news/
