I would like to use inline extensively in my project to improve performance.
As far as I know the compiler might apply inline or not; it is up to the compiler.
It is not clear to me what I can do to make this possible, but before going in that direction, do you know a way to check whether inlining really occurred in the output binary?
Use gcc -Winline to get warnings when an inline function is not inlined.
Use __attribute__((always_inline)) to force functions to be inlined.
Having said that, be warned that you can screw up performance, compile time and get huge code bloat if you use inlining injudiciously.
If you are using the MS compiler you might want to enable warning C4710 to get a warning for functions not inlined.
Use the gcc -S option to generate assembler output, and then inspect the output in your favourite text editor.
But, the compiler is often a better judge than you of when inlining will actually improve performance. Don't be too hasty to force it; profile your code and see if inlining actually is faster.
The compiler is probably smarter about this than you are, but ignoring that, assuming you don't have any special compiler flags enabled, you can dump the name list and find if the function has been generated.
/* main1.c */
static int foo(int x)
{
    return x * x;
}

int main(void)
{
    int x = 1;
    foo(x);   /* result unused; only the call itself matters here */
    return 0;
}
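To test, compile without and then with optimization. If the second nm prints nothing, no standalone copy of foo was emitted, meaning the call was inlined (or optimized away):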
not seth> gcc -o /tmp/foo /tmp/main1.c
not seth> nm /tmp/foo | grep foo
00000000004004c4 t foo
not seth> gcc -O -o /tmp/foo /tmp/main1.c
not seth> nm /tmp/foo | grep foo
The inline keyword actually has little to do with optimization. Most compilers will inline a function call (the function itself may still have to be compiled separately, e.g. if you take its address somewhere else) regardless of whether the inline keyword is present or not.
In fact, even if one called function is in another translation unit, a clever linker may inline it at link time (MSVC provides this feature as "link time code generation"). It requires strong cooperation between the compiler and the linker though.
The raison d'être of the inline keyword is to allow [non template] functions to break the One Definition Rule, and thus to be defined in header files. The actual inlining of the function will be decided by the compiler based on various heuristics and optimization flags passed to it, and not based on the inline keyword.
So massively using inline will probably do absolutely nothing about performance. If you're worried about performance, use a profiler to determine where your program spends its time (often where you don't expect it to), and act accordingly, by optimizing the actual bottleneck.