Recently I have gotten interested in disassembling C code (very simple C code). I followed a tutorial that used the Borland C++ Compiler v5.5 (which compiles C code just fine) and everything worked. Then I decided to try my own C code and compiled it in Dev-C++ (which uses gcc). Upon opening it in IDA Pro I got a surprise: the asm from gcc was really different from Borland's. I expected some difference, but the C code is EXTREMELY simple, so is it just that gcc doesn't optimize as much, or do they use different default compiler settings?
The C Code
int main(int argc, char **argv)
{
int a;
a = 1;
}
Borland ASM
.text:00401150 ; int __cdecl main(int argc,const char **argv,const char *envp)
.text:00401150 _main proc near ; DATA XREF: .data:004090D0
.text:00401150
.text:00401150 argc = dword ptr 8
.text:00401150 argv = dword ptr 0Ch
.text:00401150 envp = dword ptr 10h
.text:00401150
.text:00401150 push ebp
.text:00401151 mov ebp, esp
.text:00401153 pop ebp
.text:00401154 retn
.text:00401154 _main endp
GCC ASM (UPDATED BELOW)
.text:00401220 ; =============== S U B R O U T I N E =======================================
.text:00401220
.text:00401220 ; Attributes: bp-based frame
.text:00401220
.text:00401220 public start
.text:00401220 start proc near
.text:00401220
.text:00401220 var_14 = dword ptr -14h
.text:00401220 var_8 = dword ptr -8
.text:00401220
.text:00401220 push ebp
.text:00401221 mov ebp, esp
.text:00401223 sub esp, 8
.text:00401226 mov [esp+8+var_8], 1
.text:0040122D call ds:__set_app_type
.text:00401233 call sub_401100
.text:00401238 nop
.text:00401239 lea esi, [esi+0]
.text:00401240 push ebp
.text:00401241 mov ebp, esp
.text:00401243 sub esp, 8
.text:00401246 mov [esp+14h+var_14], 2
.text:0040124D call ds:__set_app_type
.text:00401253 call sub_401100
.text:00401258 nop
.text:00401259 lea esi, [esi+0]
.text:00401259 start endp
GCC Update: Following JimR's suggestion I went to see what sub_401100 is, and then I followed that code to another, and this seems to be the code. (Am I correct in that assumption, and if so, why does GCC have all of its code in the main function?)
.text:00401100 sub_401100 proc near ; CODE XREF: .text:004010F1j
.text:00401100 ; start+13p ...
.text:00401100
.text:00401100 var_28 = dword ptr -28h
.text:00401100 var_24 = dword ptr -24h
.text:00401100 var_20 = dword ptr -20h
.text:00401100 var_1C = dword ptr -1Ch
.text:00401100 var_18 = dword ptr -18h
.text:00401100 var_C = dword ptr -0Ch
.text:00401100 var_8 = dword ptr -8
.text:00401100
.text:00401100 push ebp
.text:00401101 mov ebp, esp
.text:00401103 push ebx
.text:00401104 sub esp, 24h ; lpTopLevelExceptionFilter
.text:00401107 lea ebx, [ebp+var_8]
.text:0040110A mov [esp+28h+var_28], offset sub_401000
.text:00401111 call SetUnhandledExceptionFilter
.text:00401116 sub esp, 4 ; uExitCode
.text:00401119 call sub_4012E0
.text:0040111E mov [ebp+var_8], 0
.text:00401125 mov eax, offset dword_404000
.text:0040112A lea edx, [ebp+var_C]
.text:0040112D mov [esp+28h+var_18], ebx
.text:00401131 mov ecx, dword_402000
.text:00401137 mov [esp+28h+var_24], eax
.text:0040113B mov [esp+28h+var_20], edx
.text:0040113F mov [esp+28h+var_1C], ecx
.text:00401143 mov [esp+28h+var_28], offset dword_404004
.text:0040114A call __getmainargs
.text:0040114F mov eax, ds:dword_404010
.text:00401154 test eax, eax
.text:00401156 jz short loc_4011B0
.text:00401158 mov dword_402010, eax
.text:0040115D mov edx, ds:_iob
.text:00401163 test edx, edx
.text:00401165 jnz loc_4011F6
.text:004012E0 sub_4012E0 proc near ; CODE XREF: sub_401000+C6p
.text:004012E0 ; sub_401100+19p
.text:004012E0 push ebp
.text:004012E1 mov ebp, esp
.text:004012E3 fninit
.text:004012E5 pop ebp
.text:004012E6 retn
.text:004012E6 sub_4012E0 endp
Compiler output is expected to be different, sometimes dramatically different, for the same source, in the same way that a Toyota and a Honda are different. Four wheels and some seats, sure, but more different than alike when you look at the details.
Likewise, the same compiler with different compiler options can and often will produce dramatically different output for the same source code, even for what appear to be simple programs.
In the case of your simple program, which actually does not do anything (the code does not affect the input, the output, or anything outside the function), a good optimizing compiler will produce nothing but main: and a return of some arbitrary value, since you didn't specify a return value. (Actually it should give a warning or an error.) This is the biggest problem I have when comparing compiler output: making something simple enough to see what the compilers are doing, but complicated enough that they do more than just pre-compute the answer and return it.
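As a rough illustration of that last point, here is a minimal sketch (my own example, not code from the question or the answer) of a test program whose result depends on run-time input, so an optimizer cannot simply pre-compute the answer and must emit real code:

#include <stdio.h>
#include <string.h>

/* The result depends on argc/argv, which are only known at run time,
   so the compiler cannot constant-fold the whole thing away. */
static int checksum(int argc, char **argv)
{
    int i, sum = 0;
    for (i = 0; i < argc; i++)
        sum += (int)strlen(argv[i]);
    return sum;
}

int main(int argc, char **argv)
{
    printf("%d\n", checksum(argc, argv));
    return 0;
}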
In the case of x86, which I assume is what you are talking about here, the processors are microcoded these days, so there is really no single answer for good code vs bad code. With each processor family the guts change around, and what used to be fast is now slow, and what is now fast was slow on the older processors. So for compilers like gcc that have continued to evolve with the new cores, the optimization can be either generic to all x86s or specific to a particular family (resulting in different code despite maximum optimization).
With your new interest in disassembling, you will continue to see the similarities and differences and find out just how many different ways the same code can be compiled. The differences are expected, even for trivial programs, and I encourage you to try as many compilers as you can. Even within the gcc family, 2.x, 3.x, 4.x and the different ways of building the compiler will produce different code for what might be thought of as the same compiler.
Good vs bad output is in the eye of the beholder. Folks that use debuggers want their code steppable and their variables watchable (in written code order), which makes for very big, bulky, and slow code (particularly on x86). And when you then compile for release you end up with a completely different program, one you have so far spent zero time debugging. Also, optimizing for performance, you take the risk of the compiler optimizing out something you wanted it to do (in your example above, no variable will be allocated and there will be no code to step through, even with minor optimization). Or worse, you expose bugs in the compiler and your program simply doesn't work (this is why -O3 is discouraged for gcc). On top of that, you find out how many places in the C standard are implementation defined, with interpretations that vary from compiler to compiler.
Unoptimized code is easier to compile, as it is a bit more literal. In the case of your example the expectation is that a variable is allocated on the stack, some sort of stack frame is set up, the immediate 1 is eventually written to that location, the stack is cleaned up, and the function returns. That is harder for compilers to get wrong, and it is more likely that your program works as you intended. Detecting and removing dead code is the business of optimization, and that is where it gets risky. Often the risk is worth the reward, but that depends on the user; beauty is in the eye of the beholder.
Bottom line, short answer: differences are expected, even dramatic differences. Default compile options vary from compiler to compiler. Experiment with the compile/optimization options and different compilers, and continue to disassemble your programs, in order to gain a better education about the language and the compilers you use. You are on the right track so far. In the case of the Borland output, it detected that your program does nothing: no input variables are used, no return value is produced or tied to the local variable, and no globals or other resources external to the function are used. The integer a and the assignment of an immediate are dead code, and a good optimizer will essentially remove/ignore both lines. So Borland set up a stack frame it did not need, cleaned it up, and returned. gcc looks to be setting up an exception handler, which is perfectly fine even though it does not need to. Start optimizing, or use a function name other than main(), and you should see different results.
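For example (my own sketch of that last suggestion, not code from the answer), moving the body into a separately named function and actually using its value forces the compiler to keep the work, and gives you a small function to disassemble that is free of the startup clutter around main():

/* Hypothetical example; build with e.g. gcc -O2 -c example.c and
   disassemble set_one(). Because set_one()'s return value is used,
   the value 1 must actually be produced (typically a single
   mov eax, 1) instead of the whole body being discarded. */
int set_one(void)
{
    int a;
    a = 1;
    return a;
}

int main(int argc, char **argv)
{
    return set_one();
}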
What is most likely happening here is that Borland calls main from its startup code, after initializing everything with code present in their runtime lib.
The gcc code does not look like main to me, but like generated code that calls main. Disassemble the code at sub_401100 and see if it looks like your main proc.
First of all, make sure you have enabled at least the -O2 optimization flag for gcc; otherwise you get no optimization at all.
With this little example you aren't really testing optimization; you're seeing how program initialization works. For example, gcc calls __set_app_type to inform Windows of the application type and performs other initialization, and sub_401100 registers atexit handlers for the runtime. Borland might call the runtime initialization before main, while gcc does it within the code around main().
Here's the disassembly of main() that I get from MinGW's gcc 4.5.1 in gdb (I added a return 0 at the end so GCC wouldn't complain).
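For reference, the source being disassembled is presumably just the original example with that return 0 added:

int main(int argc, char **argv)
{
    int a;
    a = 1;
    return 0;   /* added so gcc does not complain about a missing return value */
}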
First, when the program is compiled with -O3 optimization:
(gdb) set disassembly-flavor intel
(gdb) disassemble
Dump of assembler code for function main:
0x00401350 <+0>: push ebp
0x00401351 <+1>: mov ebp,esp
0x00401353 <+3>: and esp,0xfffffff0
0x00401356 <+6>: call 0x4018aa <__main>
=> 0x0040135b <+11>: xor eax,eax
0x0040135d <+13>: mov esp,ebp
0x0040135f <+15>: pop ebp
0x00401360 <+16>: ret
End of assembler dump.
And with no optimizations:
(gdb) set disassembly-flavor intel
(gdb) disassemble
Dump of assembler code for function main:
0x00401350 <+0>: push ebp
0x00401351 <+1>: mov ebp,esp
0x00401353 <+3>: and esp,0xfffffff0
0x00401356 <+6>: sub esp,0x10
0x00401359 <+9>: call 0x4018aa <__main>
=> 0x0040135e <+14>: mov DWORD PTR [esp+0xc],0x1
0x00401366 <+22>: mov eax,0x0
0x0040136b <+27>: leave
0x0040136c <+28>: ret
End of assembler dump.
These are a little more complex than Borland's example, but not excessively.
Note, the calls to 0x4018aa are calls to a library/compiler-supplied function that constructs C++ objects. Here's a snippet from some GCC toolchain docs:
The actual calls to the constructors are carried out by a subroutine called __main, which is called (automatically) at the beginning of the body of main (provided main was compiled with GNU CC). Calling __main is necessary, even when compiling C code, to allow linking C and C++ object code together. (If you use '-nostdlib', you get an unresolved reference to __main, since it's defined in the standard GCC library. Include '-lgcc' at the end of your compiler command line to resolve this reference.)
I'm not sure what exactly IDA Pro is showing in your examples. IDA Pro labels what it's showing as start, not main, so I'd guess that JimR's answer is right: it's probably the runtime's initialization (perhaps the entry point as described in the .exe header, which is not main() but the runtime initialization entry point).
Does IDA Pro understand gcc's debug symbols? Did you compile with the -g option so the debug symbols are generated?
It looks like the Borland compiler is recognizing that you never actually do anything with a and is just giving you the equivalent assembly for an empty main function.
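One way to see the difference (my own illustration, not part of the original answer) is to make the variable volatile; a volatile store may not be removed, so even an optimizing compiler has to keep the assignment in the generated code:

int main(int argc, char **argv)
{
    volatile int a;   /* volatile: the store below must actually be performed */
    a = 1;            /* survives optimization, unlike the plain int version */
    return 0;
}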
The difference here is mostly not in the compiled code, but in what the disassembler shows to you. You may think that main is the only function in your program, but it is not. In fact your program is something like this:
void start()
{
    /* ... some initialization code here ... */
    int result = main();
    /* ... some deinitialization code here ... */
    ExitProcess(result);
}
IDA Pro knows how Borland works, so it can navigate directly to your main, but it doesn't know how gcc works, so it shows you the true entry point of your program. You can see in the Borland ASM that main is called from some other function. In the GCC ASM you can go through all of these sub_40xxx functions to find your main.