
Performance/profiling measurement in C


I'm doing some prototyping work in C, and I want to compare how long a program takes to complete with various small modifications.

I've been using clock; from K&R:

clock returns the processor time used by the program since the beginning of execution, or -1 if unavailable.

This seems sensible to me, and has been giving results which broadly match my expectations. But is there something better to use to see what modifications improve/worsen the efficiency of my code?
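For reference, the timing pattern I'm using looks roughly like this (a minimal sketch; work_to_measure() is just a stand-in for the real code being timed):

    #include <stdio.h>
    #include <time.h>

    /* Stand-in for the real code being timed. */
    static void work_to_measure(void)
    {
        volatile double x = 0.0;
        for (long i = 0; i < 10000000L; ++i)
            x += i * 0.5;
    }

    int main(void)
    {
        clock_t start = clock();
        work_to_measure();
        clock_t end = clock();

        if (start == (clock_t)-1 || end == (clock_t)-1) {
            fprintf(stderr, "processor time unavailable\n");
            return 1;
        }

        /* clock() reports processor time; divide by CLOCKS_PER_SEC for seconds. */
        printf("CPU time: %.3f s\n", (double)(end - start) / CLOCKS_PER_SEC);
        return 0;
    }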

Update: I'm interested in both Windows and Linux here; something that works on both would be ideal.

Update 2: I'm less interested in profiling a complex problem than in the total run time/clock cycles used by a simple program from start to finish; I already know which parts of my program are slow. clock appears to fit this bill, but I don't know how vulnerable it is to, for example, other processes running in the background and chewing up processor time.


Forget time() functions; what you need is:

Valgrind!

And KCachegrind is the best GUI for examining Callgrind profiling stats. In the past I have ported applications to Linux just so I could use these tools for profiling.


For a rough measurement of overall running time, there's time ./myprog.

But for performance measurement, you should be using a profiler. For GCC, there is gprof.

Both of these assume a Unix-ish environment. I'm sure there are similar tools for Windows, but I'm not familiar with them.

Edit: For clarification: I do advise against using any gettime() style functions in your code. Profilers have been developed over decades to do the job you are trying to do with five lines of code, and provide a much more powerful, versatile, valuable, and fool-proof way to find out where your code spends its cycles.


I've found that timing programs, and finding things to optimize, are two different problems, and for both of them I personally prefer low-tech.

For timing, the trick is to make the program run long enough to measure by wrapping a loop around the operation. For example, if you run the operation 1000 times and time the whole run with a stopwatch, the seconds you measure become milliseconds per iteration once you divide by the loop count.
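A minimal sketch of that pattern (REPS and do_operation() are placeholders; time the whole run externally, e.g. with a stopwatch or the time command, then divide by REPS):

    /* Wrap the operation in a loop so the total run is long enough to
     * measure; divide the measured time by REPS for per-iteration cost. */
    #define REPS 1000

    /* Stand-in for the operation being tuned. */
    static void do_operation(void)
    {
        volatile double x = 0.0;
        for (long i = 0; i < 1000000L; ++i)
            x += i;
    }

    int main(void)
    {
        for (int i = 0; i < REPS; ++i)
            do_operation();
        return 0;
    }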

For finding things to optimize, there are pieces of code (terminal instructions and function calls) that are responsible for various fractions of the time. During that time, they are exposed on the stack. So you can wrap a loop around the program to make it take long enough, and then take stackshots. The code to optimize will jump out at you.


In POSIX (e.g. on Linux), you can use gettimeofday() to get higher-precision timing values (microseconds).
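A minimal sketch, assuming a POSIX system (the timed region is left as a comment placeholder):

    #include <stdio.h>
    #include <sys/time.h>

    int main(void)
    {
        struct timeval t0, t1;

        gettimeofday(&t0, NULL);
        /* ... code being timed ... */
        gettimeofday(&t1, NULL);

        /* Elapsed wall-clock time in seconds, with microsecond resolution. */
        double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                       + (double)(t1.tv_usec - t0.tv_usec) / 1e6;
        printf("elapsed: %.6f s\n", elapsed);
        return 0;
    }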

In Win32, QueryPerformanceCounter() is popular.
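And a corresponding Win32 sketch (again, the timed region is a placeholder):

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        LARGE_INTEGER freq, t0, t1;

        QueryPerformanceFrequency(&freq);   /* counter ticks per second */
        QueryPerformanceCounter(&t0);
        /* ... code being timed ... */
        QueryPerformanceCounter(&t1);

        double elapsed = (double)(t1.QuadPart - t0.QuadPart)
                       / (double)freq.QuadPart;
        printf("elapsed: %.6f s\n", elapsed);
        return 0;
    }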

Beware of CPU clock-changing effects: if your CPU decides to clock down during the test, results may be skewed.


If you can use POSIX functions, have a look at clock_gettime; a quick Google search turns up examples of how to use it. To measure the processor time taken by your program, pass CLOCK_PROCESS_CPUTIME_ID as the first argument to clock_gettime, if your system supports it. Since clock_gettime uses struct timespec, you can probably get useful nanosecond resolution.
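A minimal sketch of that usage, assuming CLOCK_PROCESS_CPUTIME_ID is supported (on older glibc you may need to link with -lrt):

    #define _POSIX_C_SOURCE 199309L  /* for clock_gettime in strict modes */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec t0, t1;

        if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0) != 0) {
            perror("clock_gettime");
            return 1;
        }
        /* ... code being timed ... */
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);

        /* CPU time consumed by this process, with nanosecond resolution. */
        double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                       + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("CPU time: %.9f s\n", elapsed);
        return 0;
    }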

As others have said, for any serious profiling work, you will need to use a dedicated profiler.

