
obfuscated C/asm "Hello, world!" program, I don't understand

开发者 https://www.devze.com 2023-01-04 09:53 Source: Internet
Why does the following code print "Hello, world!" (on "my" system)?

        .file   "test.c"
.globl main
        .data
        .align 32
        .type   main, @object
        .size   main, 56
main:
        .value  3816
        .value  0
        .value  18432
        .value  27749
        .value  28524
        .value  8236
        .value  28535
        .value  27762
        .value  8548
        .value  -29942
        .value  9228
        .value  7305
        .value  -17884
        .value  14
        .value  0
        .value  20818
        .value  443
        .value  0
        .value  21248
        .value  1208
        .value  0
        .value  -32000
        .value  1260
        .value  -32563
        .value  -15229
        .value  23312
        .value  -16335
        .value  -28477

Also, what does .value mean, and how is it translated to machine code?


I assembled it and then disassembled it, and here's what I got. I did not run it though, because I know better than to run random pieces of assembly language I find on the net.

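The code starts with a call to the print code. As a side effect, the call pushes its return address, which is the address of the string that follows: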

00000000 <main>:
   0:   e8 0e 00 00 00          call   13 <main+0x13>

Then follows the ASCII for "Hello, world!\n" (14 bytes):

   5:   48 65 6c 6c 6f 2c 20
   c:   77 6f 72 6c 64 21 0a

The code that the first instruction calls:

  13:   8b 0c 24                mov    (%esp),%ecx
  16:   89 1c 24                mov    %ebx,(%esp)
  19:   ba 0e 00 00 00          mov    $0xe,%edx
  1e:   52                      push   %edx
  1f:   51                      push   %ecx
  20:   bb 01 00 00 00          mov    $0x1,%ebx
  25:   53                      push   %ebx
  26:   b8 04 00 00 00          mov    $0x4,%eax
  2b:   83 ec 04                sub    $0x4,%esp
  2e:   cd 80                   int    $0x80
  30:   83 c4 10                add    $0x10,%esp
  33:   5b                      pop    %ebx
  34:   31 c0                   xor    %eax,%eax
  36:   c3                      ret
  37:   90                      nop
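The address arithmetic behind the call trick can be checked by hand. The sketch below is only an illustration: the constant names are made up for this example, and reading eax = 4 as the write system call with ebx = 1 as stdout assumes a 32-bit Linux-style int $0x80 convention, which the answer itself does not state.

```python
# Offsets and operands read from the disassembly above.
# All names here are hypothetical, chosen for illustration only.
CALL_OFFSET = 0x00    # address of the call instruction
CALL_LENGTH = 5       # opcode e8 plus a 4-byte relative displacement
DISPLACEMENT = 0x0E   # from the operand bytes 0e 00 00 00
EDX_VALUE = 0x0E      # from "mov $0xe,%edx"

# call pushes the address of the next instruction, which here is the string.
return_address = CALL_OFFSET + CALL_LENGTH
# The relative displacement is added to that same address.
call_target = return_address + DISPLACEMENT

message = b"Hello, world!\n"

print(hex(return_address))  # where the string starts
print(hex(call_target))     # where the print code starts
print(len(message))         # should match the length loaded into edx
```

So "mov (%esp),%ecx" picks up the string's address from the stack, and the displacement of the call happens to equal the string length because the string is the only thing being skipped.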

FWIW, the method used was to paste your code into a file, foo.S, then:

gcc -c foo.S
objdump -D foo.o


This looks like assembly code. Some key things to point out are:

  1. ".globl main" : This directive tells the assembler to make the symbol main visible outside of this assembly file.
  2. "main:" : This creates the symbol main, which the linker looks for when linking.
  3. ".value" : This causes the assembler to emit a 16-bit (two-byte) value, stored little-endian on x86.

Given this, the remaining trick is to convert the emitted bytes into the instructions they correspond to. This can be done using the opcode map in the "Intel 64 and IA-32 Architectures Software Developer's Manual: Volume 2B", available from Intel.
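Before reaching for the opcode map, it helps to see the raw bytes. Here is a minimal sketch that packs the listed values the way an x86 assembler would, assuming each .value is a signed 16-bit little-endian word; the function name values_to_bytes is made up for this example.

```python
import struct

# The 28 operands of the .value directives, copied from the question.
VALUES = [
    3816, 0, 18432, 27749, 28524, 8236, 28535, 27762, 8548, -29942,
    9228, 7305, -17884, 14, 0, 20818, 443, 0, 21248, 1208,
    0, -32000, 1260, -32563, -15229, 23312, -16335, -28477,
]


def values_to_bytes(values):
    """Pack each value as a signed 16-bit little-endian word."""
    return b"".join(struct.pack("<h", v) for v in values)


code = values_to_bytes(VALUES)

# First five bytes: the call instruction.
print(" ".join(f"{b:02x}" for b in code[:5]))
# Next fourteen bytes: the embedded string.
print(code[5:19])
# The rest is the code that the call jumps to.
print(" ".join(f"{b:02x}" for b in code[19:]))
```

The hex dump can then be compared byte for byte against a disassembler's output or decoded by hand with the manual.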


.value is raw memory data: the values are the machine code itself. The file contains the program not as human-readable mnemonics but as decimal numbers (they look like two-byte signed numbers). If you want to see the program as readable code, assemble it and run the result through any disassembler.
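The negative numbers are the same 16-bit patterns written as signed values. A small sketch of the conversion, assuming two's complement and little-endian byte order as on x86 (the helper names are hypothetical):

```python
def as_unsigned16(value):
    """Return the 16-bit two's-complement bit pattern of a signed value."""
    return value & 0xFFFF


def little_endian_bytes(value):
    """Split the 16-bit pattern into low byte first, high byte second."""
    word = as_unsigned16(value)
    return bytes([word & 0xFF, word >> 8])


# Three of the negative operands from the listing.
for v in (-29942, -32563, -28477):
    pattern = as_unsigned16(v)
    raw = little_endian_bytes(v)
    print(v, hex(pattern), " ".join(f"{b:02x}" for b in raw))
```

For example, -32563 is the pattern 0x80cd, which lands in memory as the bytes cd 80.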


.value isn't translated to machine code. It's an assembler directive.

The .globl directive exports the main symbol. This is typical, because main needs to be publicly visible to the linker. The .data directive switches to the data section, and the .type and .size directives then declare main to be an object of size 56, since there are 28 .value statements, each emitting a 16-bit value.
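The size figure is simple arithmetic, sketched here with illustrative names and assuming two bytes per .value as on x86:

```python
import struct

VALUE_COUNT = 28                    # number of .value lines in the listing
WORD_SIZE = struct.calcsize("<h")   # bytes in one 16-bit value

total_size = VALUE_COUNT * WORD_SIZE
print(total_size)  # matches the operand of ".size main, 56"
```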

The numbers specified after .value appear to be the raw binary values representing already translated instructions. It would require knowledge of your operating system and processor to further determine their meaning.

Hope that helps.
