I want to calculate performance of a function in micro second precision on Windows platform.
Now Windows itself has milisecond granuality, so how can I achieve this.
I tried following sample, but not getting correct results.
LARGE_INTEGER ticksPerSecond = {0};
LARGE_INTEGER tick_1 = {0};
LARGE_INTEGER tick_2 = {0};
double uSec = 1000000;
// Get the frequency
QueryPerformanceFrequency(&ticksPerSecond);
//Calculate per uSec freq
double uFreq = ticksPerSecond.QuadPart/uSec;
// Get count开发者_运维百科er b4 start of op
QueryPerformanceCounter(&tick_1);
// The ope itself
Sleep(10);
// Get counter after opfinished
QueryPerformanceCounter(&tick_2);
// And now the op time in uSec
double diff = (tick_2.QuadPart/uFreq) - (tick_1.QuadPart/uFreq);
Run the operation in a loop a million times or so and divide the result by that number. That way you'll get the average execution time over that many executions. Timing one (or even a hundred) executions of a very fast operation is very unreliable, due to multitasking and whatnot.
- compile it
- look at the assembler output
- count the number of each instruction in your function
- apply the cycles per instruction on your target processor
- end up with a cycle count
- multiply by the clock speed you are running at
- apply arbitrary scaling factors to account for cache misses and branch mis-predictions lol
(man I am so going to get down-voted for this answer)
No, you are probably getting an accurate result, QueryPerformanceCounter() works well for timing short intervals. What's wrong is the your expectation of the accuracy of Sleep(). It has a resolution of 1 millisecond, its accuracy is far worse. No better than about 15.625 milliseconds on most Windows machine.
To get it anywhere close to 1 millisecond, you'll have to call timeBeginPeriod(1) first. That probably will improve the match, ignoring the jitter you'll get from Windows being a multi-tasking operating system.
If you're doing this for offline profiling, a very simple way is to run the function 1000 times, measure to the closest millisecond and divide by 1000.
To get finer resolution than 1 ms, you will have to consult your OS documentation. There may be APIs to get timer resolution in microsecond resolution. If so, run your application many times and take the averages.
I like Matti Virkkunen's answer. Check the time, call the function a large number of times, check the time when you finish, and divide by the number of times you called the function. He did mention you might be off due to OS interrupts. You might vary the number of times you make the call and see a difference. Can you raise the priority of the process? Can you get it so all the calls within a single OS time slice?
Since you don't know when the OS might swap you out, you can put this all inside a larger loop to do the whole measurement a large number of times, and save the smallest number as that is the one that had the fewest OS interrupts. This still may be greater than the actual time for the function to execute because it may still contain some OS interrupts.
Sanjeet,
It looks (to me) like you're doing this exactly right. QueryPerformanceCounter is a perfectly good way to measure short periods of time with a high degree of precision. If you're not seeing the result you expected, it's most likely because the sleep isn't sleeping for the amount of time you expected it to! However, it is likely being measured correctly.
I want to go back to your original question about how to measure the time on windows with microsecond precision. As you already know, the high performance counter (i.e. QueryPerformanceCounter) "ticks" at the frequency reported by QueryPerformanceFrequency. That means that you can measure time with precision equal to:
1/frequency seconds
On my machine, QueryPerformanceFrequency reports 2337910 (counts/sec). That means that my computer's QPC can measure with precision 4.277e-7 seconds, or 0.427732 microseconds. That means that the smallest bit of time I can measure is 0.427732 microseconds. This, of course, gives you the precision that you originally asked for :) Your machine's frequency should be similar, but you can always do the math and check it.
Or you can use gettimeofday() which gives you a timeval struct that is a timestamp (down to µs)
精彩评论