To prevent a function that was consuming 46% of the runtime from being inlined, I used __attribute__((noinline)) on it and compiled the code with gcc -Wall -Winline -O2 (these plus -g are what is used by the Makefile - I also see roughly the same effect when using -g as well) using gcc 4.5.2. I found that the program with the non-inlined function is more than 20% faster than the original. Does anyone know why this might be?
Let me provide some more details. The program that this occurred in is the latest version of the compression utility bzip2 for Linux. The key function (generateMTFValues, found in compress.c) in the program is the one that does the Move To Front transform. This function is only called by one function in the program.
Does anyone have any idea why the program runs faster in this case by forcing the compiler not to inline this function? The function only takes one parameter - a pointer to a struct that contains all of the block and compression info. Also, it only calls one other function which doesn't really consume any substantial processing time.
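For anyone who wants to reproduce the setup, here is a minimal sketch of what I mean. It is not the bzip2 source: Block, move_to_front and compress_block are hypothetical stand-ins I made up for the real struct, generateMTFValues and its single caller. The only part taken from my actual build is the __attribute__((noinline)) marker.

```c
#include <stddef.h>

/* Hypothetical stand-in for the struct holding the block and compression
   info; the real bzip2 struct is far larger. len must not exceed 16. */
typedef struct {
    unsigned char data[16];
    size_t len;
} Block;

/* noinline asks gcc to keep this as a real call, even at -O2 and even
   though it has exactly one caller. */
__attribute__((noinline))
static void move_to_front(Block *b)
{
    unsigned char order[256];

    /* Start with the identity ordering of all byte values. */
    for (int i = 0; i < 256; i++)
        order[i] = (unsigned char)i;

    for (size_t i = 0; i < b->len; i++) {
        unsigned char sym = b->data[i];

        /* Find the symbol's current position in the list. */
        int pos = 0;
        while (order[pos] != sym)
            pos++;

        /* Shift the preceding entries down and move the symbol to the front. */
        for (int j = pos; j > 0; j--)
            order[j] = order[j - 1];
        order[0] = sym;

        /* Replace the byte with its list position, in place. */
        b->data[i] = (unsigned char)pos;
    }
}

/* The single caller, standing in for the one function that calls the
   transform in the real program. */
void compress_block(Block *b)
{
    move_to_front(b);
}
```

Compiling this with gcc -O2 -S, with and without the attribute, and comparing the assembly for compress_block shows whether the call survives or the loop gets merged into the caller.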
Inlining can slow down the program, because the resulting code is larger and can lead to more misses in the CPU's instruction cache.
This is a complete WAG (Wild Ass Guess) based on near-perfect ignorance.
Could it be that for the inline version the optimizer is really busy juggling which values are in which registers and when? If that's the case, the procedure call version may give it room to devote more registers to what is happening in the loop.
As I said, just a WAG.