Are there any standard ways (using profilers) to check whether GCC's branch prediction macros actually save clock cycles through better instruction pipelining? How can we measure a program with and without these macros? Is measuring the elapsed time the only way to do it?
Are there similar branch prediction macros on Windows (the __assume keyword, for example)?
-Kartlee
I’m not familiar with any profilers that will show branch efficiencies. The Linux time program should work well enough to help you benchmark.
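If timing the whole process with time is too coarse, one option is to time just the code under test from inside the program. The following is only a minimal sketch, assuming standard C's clock() is precise enough for your workload; cpu_seconds, sample_work and sink_value are hypothetical names made up for illustration.

```c
#include <time.h>

/* Hypothetical workload: stands in for the code you want to compare
   with and without the branch hints. */
volatile long sink_value = 0;

void sample_work(void)
{
    sink_value += 1;
}

/* Hypothetical helper: returns the CPU time, in seconds, consumed by
   calling `work` `iterations` times. clock() measures processor time,
   not wall-clock time. */
double cpu_seconds(void (*work)(void), long iterations)
{
    clock_t start = clock();
    for (long i = 0; i < iterations; ++i) {
        work();
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Build the program twice, once with the hints and once without, call cpu_seconds(sample_work, N) with a large N in each build, and compare the results over several runs, since a single measurement is noisy.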
On modern x86 CPUs, conditional jump (Jcc) instructions are generally faster if they don’t branch and instead just fall through to the next instruction.
GCC’s __builtin_expect function provides a hint to the compiler: it tells which side of an if() should be the fall-through and which side should be the branch. You should only use this function if you are 100% sure which way the branch usually goes. There is no equivalent function for VC++. I’m not sure about ICC.
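As an illustration of the hint described above, here is a minimal sketch. The LIKELY/UNLIKELY macro names and the sum_checked function are hypothetical, chosen for this example; only __builtin_expect itself is the GCC built-in. The fallback branch assumes a compiler without the built-in simply evaluates the condition unchanged.

```c
#include <stddef.h>

/* Hypothetical wrapper macros. On GCC-compatible compilers they pass
   the expected outcome to __builtin_expect; elsewhere they are no-ops. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Sums the array, treating a negative entry as a rare error.
   Returns -1 on the first negative value, otherwise the total. */
long sum_checked(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (UNLIKELY(values[i] < 0)) {
            return -1;          /* rare path: hinted as not taken */
        }
        total += values[i];     /* common path: meant to fall through */
    }
    return total;
}
```

The hint only affects how the compiler lays out the code; the function returns the same result either way, so a wrong hint costs speed rather than correctness.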
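A better way to do this is to avoid these non-standard functions and use Profile Guided Optimization (PGO): you build an instrumented version of the program, run it on representative input so it records how often each branch is taken, and then rebuild using that profile so the compiler can arrange the branches itself. With GCC this is done with the -fprofile-generate and -fprofile-use options.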