I want to use the hardware performance counters that come with Intel and AMD x86_64 multicore processors to count the number of stores retired by a program. I want each thread to count its retired stores separately. Can it be done? And if so, how, in C/C++?
You can use Perfctr or PAPI if you want to count hardware events for some part of the program internally (without starting any third-party tool).
Perfctr quickstart: http://www.ale.csce.kyushu-u.ac.jp/~satoshi/how_to_use_perfctr.htm
PAPI homepage: http://icl.cs.utk.edu/papi/
PerfSuite good doc: http://perfsuite.ncsa.illinois.edu/publications/LJ135/x27.html
If you can do this externally, there is the perf command on modern Linux.
perf wiki: https://perf.wiki.kernel.org/index.php/Main_Page
The best approach is to use perf on Linux, as osgx mentioned, since it is part of the Linux kernel. It can also be called directly from C/C++ code, so there is no need for external perf stat calls.
Just download the kernel source code and take a look at it. Alternatively, take a look at this library, which I think is by Google:
http://perfmon2.sourceforge.net/docs_v4.html
It is part of the perfmon2 project but is designed to work with perf. Take a look at the perf_examples directory and you will get the idea. That is how I handle perf calls from within my C code.
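To give an idea of what calling perf from C looks like, here is a minimal sketch that uses the raw perf_event_open system call rather than the library above. The helper names (open_store_counter, start_counter, stop_and_read) are made up for illustration. It also assumes that the generic "L1 data cache write access" event is an acceptable stand-in for retired stores; whether that event exists, and which hardware event it maps to, depends on the CPU and kernel, so for an exact count you may need a vendor-specific raw event code instead.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open a store counter for the calling thread only.
 * pid = 0 and cpu = -1 mean "this thread, on whatever CPU it runs";
 * inherit is left at 0, so other threads are not included.
 * Returns a file descriptor, or -1 if the counter is unavailable. */
static int open_store_counter(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HW_CACHE;
    /* Assumption: L1D write accesses approximate retired stores. */
    attr.config = PERF_COUNT_HW_CACHE_L1D |
                  (PERF_COUNT_HW_CACHE_OP_WRITE << 8) |
                  (PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16);
    attr.disabled = 1;        /* start stopped, enable explicitly */
    attr.exclude_kernel = 1;  /* count user-space stores only */
    attr.exclude_hv = 1;

    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Reset the counter to zero and start counting. Returns 0 on success. */
static int start_counter(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == -1 ? -1 : 0;
}

/* Stop counting and read the value into *count. Returns 0 on success. */
static int stop_and_read(int fd, uint64_t *count)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;
    /* With the default read_format, read() yields a single 64-bit value. */
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Each thread would call open_store_counter() at the start of its own thread function, wrap the region of interest in start_counter() and stop_and_read(), and close() the descriptor when done. If the open fails, check errno: the event may not be supported on that CPU, or the perf_event_paranoid setting in /proc/sys/kernel may not allow unprivileged access.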
The official application from AMD is named CodeAnalyst
Have you checked out oprofile yet?
http://oprofile.sourceforge.net/