Linux, gcc 4.4.1, C99
I am wondering what is the best way to test the performance of a C program.
I have some functions that I have implemented. However, I could have used a different design for each function.
Basically, I want to test which design gives better performance.
Many thanks,
Take a look at this post on code profilers.
I want to test to see which design gives better performance.
Why does it matter? This is not a flip question! You should have a performance target in mind, and if you meet it, your code is fast enough.
How do you know how fast is "fast enough"? It turns out the user-interface people have good data on the effect of response time on your users' experience:
0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result. (Most people have a reaction time of about 0.1 seconds; jet fighter pilots get down to around 0.08s, i.e., 80ms.)
1 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of directly "driving" your application.
10 seconds is about the limit for keeping the user's attention focused on the app. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is hard to predict or varies a lot.
The quantitative results above apply only to interaction, of course, which is measured in seconds of waiting time. But even if your target is network packets sent, pages of RAM allocated, blocks of disk read/written, or just watts of power consumed, the message I am trying to communicate is that you should have a performance target, that target should be quantified, and the target should be connected to the needs of your users. If you don't have a quantifiable target, you're not doing engineering; you're just whistling in the dark. Unless your goal is to educate yourself (or to satisfy idle curiosity), the question you should be asking is "is my code good enough that I can move on?"
If you're not meeting your performance target, or if you are trying to educate yourself, I think the best combination of readable and detailed information comes from using the valgrind profiler (--tool=callgrind --dump-instr=yes) together with the kcachegrind visualizer.
Mostly you will want to use a profiler. The post pointed to by Fragsworth is a good start. Personally, I prefer Shark on Mac OS X and gprof on Linux.
But in your case, you may also call clock() or getrusage(), for example, in this way:
clock_t t = clock();  /* CPU time used by the program so far */
for (int i = 0; i < 1000; ++i) my_func();
printf("time = %f\n", (double)(clock() - t) / CLOCKS_PER_SEC);
A profiler is useful when you want to dig out which part of the code takes the most time. Calling clock()/getrusage() is more convenient (to me) when you want to compare/benchmark different implementations.
You can use gprof, which is a free profiler.
The first thing to find out is whether you need to optimize those functions. Unless they are in the critical path for your code, they may be more than fast enough.
If you have profiled your application and found they are slow, one good way to test the performance is to call the function some large number of times and find out the average time it takes to run.
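You should also try to measure CPU time instead of wall-clock time, as that is a more accurate gauge: CPU time does not include the time your process spends waiting while other processes use the machine.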
In addition to profiling, you need to run the code under test from a harness (driver) to average out the readings. That way your comparisons are not skewed by one-off readings, and you have a large sample population with a mean and standard deviation to compare. There are many multi-threaded frameworks that can drive the load for you.