How are variables in shared libraries referenced by loader?

开发者 https://www.devze.com 2023-03-12 13:31 Source: the web
I now understand how dynamic functions are referenced: through the procedure linkage table (PLT), as shown below:

Dump of assembler code for function foo@plt:
0x0000000000400528 <foo@plt+0>: jmpq   *0x2004d2(%rip)        # 0x600a00 <_GLOBAL_OFFSET_TABLE_+40>
0x000000000040052e <foo@plt+6>: pushq  $0x2
0x0000000000400533 <foo@plt+11>:    jmpq   0x4004f8
(gdb) disas 0x4004f8
No function contains specified address.

But I don't know how dynamic variables are referenced. I found that their values are populated in the GOT once the program has started, but there is no stub like the one above. How does it work?


The dynamic loader relocates all references to variables before transferring control to the user program.

There is no "stub" for them, because once the user program starts executing, it is not possible for the loader to regain control and update variable addresses. If this isn't clear to you, then you have not really understood how the PLT lazy-resolution stub works.
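The difference between the two cases can be modelled in plain C. The following is only a toy sketch, not the loader's actual implementation: all names (lib_var, lib_foo, got_var, got_foo, lazy_resolver, eager_relocate) are hypothetical. A function slot can start out pointing at a resolver, because calling through it hands control to that resolver. A data slot is simply dereferenced, so nothing gets a chance to run and it must already hold the final address.

```c
#include <stddef.h>

typedef int (*fn_t)(void);

/* Hypothetical stand-ins for a variable and a function exported by a shared library. */
int lib_var = 42;
int lib_foo(void) { return 7; }

int resolver_calls = 0;          /* counts how often the toy lazy stub ran */

int lazy_resolver(void);         /* forward declaration of the toy stub */

int *got_var = NULL;             /* data slot: must be filled in before first use */
fn_t got_foo = lazy_resolver;    /* function slot: initially points at the stub */

/* Toy lazy stub: runs only because a call through got_foo transfers control here. */
int lazy_resolver(void)
{
    resolver_calls++;
    got_foo = lib_foo;           /* patch the slot so later calls go direct */
    return got_foo();            /* complete the call that triggered the lookup */
}

/* Toy model of what happens before the user program gets control. */
void eager_relocate(void) { got_var = &lib_var; }

/* A data access is a plain load through the slot; no code runs in between. */
int read_var(void) { return *got_var; }

/* A function call goes through the slot, so the first call can be intercepted. */
int call_foo(void) { return got_foo(); }
```

In this model, eager_relocate() has to run before read_var(), otherwise read_var() dereferences a null pointer. call_foo() works without any preparation, and the stub runs only on the first call.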


Global variables are accessed indirectly, via a global offset table.

  • When compiling a program, the compiler generates code that performs indirect accesses, and emits relocation information specifying the entry in the global offset table being used.
  • The linker resolves these relocations when creating the final dynamically loadable object, so the machine code itself needs no further patching at load time; only the entries in the global offset table are filled in by the runtime loader.

To see this in action, consider the following code fragment.

int v1;
int f(void) { return !v1; }

The function f references a global v1. The machine code generated for the function looks like the following (on an i386):

% gcc -c -fpic a.c
% objdump --disassemble --reloc a.o
[snip]
Disassembly of section .text:

00000000 <f>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   e8 fc ff ff ff          call   4 <f+0x4>
            4: R_386_PC32   __i686.get_pc_thunk.cx
   8:   81 c1 02 00 00 00       add    $0x2,%ecx
            a: R_386_GOTPC  _GLOBAL_OFFSET_TABLE_
   e:   8b 81 00 00 00 00       mov    0x0(%ecx),%eax
            10: R_386_GOT32 v1
  14:   8b 00                   mov    (%eax),%eax
  16:   85 c0                   test   %eax,%eax
  18:   0f 94 c0                sete   %al
  1b:   0f b6 c0                movzbl %al,%eax
  1e:   5d                      pop    %ebp
  1f:   c3                      ret    

Disassembly of section .text.__i686.get_pc_thunk.cx:

00000000 <__i686.get_pc_thunk.cx>:
   0:   8b 0c 24                mov    (%esp),%ecx
   3:   c3                      ret

Machine code walk-through:

  • (Offsets 0x0 and 0x1) The standard function prologue.
  • (Offset 0x3) The call to __i686.get_pc_thunk.cx prepares for PC-relative addressing by loading the address of the instruction after the call into register %ecx.
  • (Offset 0x8) The value in %ecx is adjusted to point to the start of the global offset table. This adjustment is signalled by the relocation entry of type R_386_GOTPC.
  • (Offset 0xE) The address of global v1 is retrieved. The R_386_GOT32 relocation supplies the offset of v1's entry from the base of the global offset table.
  • (Offset 0x14) The value in v1 is retrieved into register %eax.
  • (Offsets 0x16--0x1F) The rest of the computation for function f.
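The walk-through above corresponds roughly to the following C, written as a sketch with a hand-made table standing in for the global offset table. The names toy_got, V1_SLOT, toy_fill_got and f_via_got are assumptions made for illustration; a real GOT is laid out by the linker, not declared in source.

```c
/* The global from the example in the text. */
int v1;

/* Hypothetical one-entry table standing in for the global offset table. */
enum { V1_SLOT = 0, TOY_GOT_SLOTS = 1 };
void *toy_got[TOY_GOT_SLOTS];

/* Stands in for the runtime loader storing the address of v1 in its slot. */
void toy_fill_got(void) { toy_got[V1_SLOT] = &v1; }

/* Mirrors the instruction sequence of f, one step per statement. */
int f_via_got(void)
{
    void **got_base = toy_got;       /* offsets 0x3 and 0x8: %ecx = table base */
    int *addr = got_base[V1_SLOT];   /* offset 0xE: load the address of v1 from its slot */
    int value = *addr;               /* offset 0x14: load the value of v1 */
    return !value;                   /* offsets 0x16 onward: test, sete, movzbl */
}
```

After toy_fill_got() has run, f_via_got() returns 1 while v1 is zero and 0 otherwise, which is what f computes in the original fragment.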

In the final shared object, the linker patches the function's code to the following:

% gcc -shared -o a.so a.o
% objdump --disassemble a.so
...snip...
0000044c <f>:
 44c:   55                      push   %ebp
 44d:   89 e5                   mov    %esp,%ebp
 44f:   e8 18 00 00 00          call   46c <__i686.get_pc_thunk.cx>
 454:   81 c1 a0 1b 00 00       add    $0x1ba0,%ecx
 45a:   8b 81 f8 ff ff ff       mov    -0x8(%ecx),%eax
 460:   8b 00                   mov    (%eax),%eax
 462:   85 c0                   test   %eax,%eax
...snip...
  • Assuming that the object gets loaded at offset O in memory, the call instruction at offset 0x44F loads the return address O+0x454 into %ecx, and the add instruction at offset 0x454 then adds 0x1BA0, leaving O+0x1FF4 in %ecx.
  • The instruction at offset 0x45A reads from the address 8 bytes below %ecx, which is the slot for v1 in the global offset table, i.e., the slot for v1 is at offset 0x1FEC from the start of the shared object.
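The address arithmetic above can be checked mechanically. This is a small sketch using the constants read off the disassembly. The function and macro names are made up for illustration, and 32-bit unsigned arithmetic is assumed to mirror i386 address wrap-around.

```c
#include <stdint.h>

/* Constants taken from the disassembly of a.so shown in the text. */
#define RET_ADDR_OFFSET 0x454u       /* address of the instruction after the call */
#define GOT_ADJUST      0x1BA0u      /* immediate of the add at offset 0x454 */
#define V1_SLOT_DISP    0xFFFFFFF8u  /* displacement -0x8, as encoded in the mov */

/* Value left in %ecx after the call and the add, for a given load offset O. */
uint32_t got_base_addr(uint32_t load_base)
{
    return load_base + RET_ADDR_OFFSET + GOT_ADJUST;
}

/* Address the mov at offset 0x45A reads from: %ecx - 8, modulo 2^32. */
uint32_t v1_slot_addr(uint32_t load_base)
{
    return got_base_addr(load_base) + V1_SLOT_DISP;
}
```

With a load offset of 0, got_base_addr gives 0x1FF4 and v1_slot_addr gives 0x1FEC, matching the offsets in the text. Any other load offset shifts both results by the same amount.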

Looking at the dynamic relocation records for the shared object, we see a relocation record instructing the runtime loader to store the actual address for v1 at offset 0x1FEC.

% objdump -R a.so
DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
...snip...
00001fec R_386_GLOB_DAT    v1
...snip...
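What the runtime loader does with such a record can be sketched as follows. This is a simplified model, not the code of any real dynamic loader: the toy_reloc and toy_symbol structures are hypothetical (real ELF records pack the symbol index and type into one r_info field and refer to a symbol table), and symbol lookup is reduced to a linear search by name. The type value 6 for R_386_GLOB_DAT is the one defined by the i386 ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_R_386_GLOB_DAT 6u   /* relocation type value from the i386 ABI */

/* Hypothetical, simplified relocation record and symbol table entry. */
typedef struct { uint32_t offset; uint32_t type; const char *sym; } toy_reloc;
typedef struct { const char *name; uint32_t addr; } toy_symbol;

/* Linear lookup by name; returns 0 on success and stores the address in *out. */
static int toy_lookup(const toy_symbol *syms, size_t n, const char *name, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(syms[i].name, name) == 0) {
            *out = syms[i].addr;
            return 0;
        }
    }
    return -1;
}

/* For each GLOB_DAT record, store the symbol's address at image + offset.
 * Other record types are skipped in this sketch. Returns 0 on success. */
int toy_apply_relocs(uint8_t *image, size_t image_size,
                     const toy_reloc *rels, size_t nrels,
                     const toy_symbol *syms, size_t nsyms)
{
    for (size_t i = 0; i < nrels; i++) {
        uint32_t addr;
        if (rels[i].type != TOY_R_386_GLOB_DAT)
            continue;
        if (image_size < 4 || rels[i].offset > image_size - 4)
            return -1;                                /* slot outside the image */
        if (toy_lookup(syms, nsyms, rels[i].sym, &addr) != 0)
            return -1;                                /* undefined symbol */
        memcpy(image + rels[i].offset, &addr, 4);     /* fill the 32-bit slot */
    }
    return 0;
}
```

Applied to a record with offset 0x1FEC and symbol v1, this writes the resolved address of v1 into the slot that the mov at offset 0x45A later reads. All of this happens before the program's own code runs.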

Further reading:

  • Pat Beirne's "Study of ELF loading and relocs" has more information about ELF relocations.