Can I choose RIP-relative or absolute addressing for different variables with gcc in x86-64?

Developer (https://www.devze.com), 2023-02-15 23:47. Source: the web

I wrote my own linker script to put different variables in two different data sections (A & B).

A is linked at address zero; B is linked near the code, in high address space (above 4G, which is out of reach for normal absolute addressing in x86-64).

A can be accessed through absolute addressing, but not RIP-relative; B can be accessed through RIP-relative addressing, but not absolute.

My question: Is there any way to choose RIP-relative or absolute addressing for different variables in gcc? Perhaps with some annotation like #pragma?

Without hacking the GCC source code, you're not going to get it to emit 32-bit absolute addressing, but there are cases where gcc will use 64-bit absolute addresses.

-mcmodel=medium puts large objects into a separate section and uses 64-bit absolute addresses for that large-data section. (The size threshold is set by -mlarge-data-threshold=, and all object files have to agree on it.) It still uses RIP-relative addressing for all other variables.

See the x86-64 System V ABI doc for more about the different memory models. And/or GCC docs for -mcmodel= and -mlarge-data-threshold= : https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
The default is -mcmodel=small : everything is within 2GiB of everything else, so RIP-relative works. And for non-PIE executables, that's the low 2GiB of virtual address space so static addresses can be 32-bit absolute sign- or zero-extended immediates or disp32 in addressing modes.

int a[1000000];
int b[1];

int fa() {   return a[0];  }
int fb() {   return b[0];  }

ASM output (Godbolt):

# gcc9.2 -O3 -mcmodel=medium
fa():
        movabs  eax, DWORD PTR [a]     # 64-bit absolute address, special encoding for EAX
        ret
fb():
        mov     eax, DWORD PTR b[rip]
        ret

For loading into a register other than AL/AX/EAX/RAX, GCC would use movabs r64, imm64 with the address and then use mov reg, [reg].
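To look at that two-instruction sequence yourself, you could compile something like the sketch below, where a runtime index prevents the accumulator-only absolute encoding. The names big_table and big_at are hypothetical, and I haven't checked the exact instructions any particular GCC version emits for it.

```c
/* Hypothetical example: compile with  gcc -O3 -mcmodel=medium -S  and read the asm. */

/* 4,000,000 bytes: well above the default -mlarge-data-threshold, so under
   -mcmodel=medium this should be treated as a large object. */
int big_table[1000000];

/* A runtime index rules out the accumulator-only absolute-address encoding,
   so the address is expected to be put in a register first (assumption). */
int big_at(long i)
{
    return big_table[i];
}
```

If the compiler behaves as described above, the output should contain a movabs of the array address into a 64-bit register, followed by an indexed load through that register.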

You won't get gcc to use 32-bit absolute addressing for section A. It will always be using 64-bit absolute, never [array + rdx*4] or [abs foo] (NASM syntax). And never mov edi, msg (imm32) for putting an address in a register, always mov rdi, qword msg (imm64).

GCC puts a in the .lbss section and b in the regular .bss. Presumably you can use __attribute__((section("name"))) on

        .globl  a
        .section        .lbss,"aw"           # "aw" = allocatable, writable
        .align 32
        .size   a, 4000000
a:
        .zero   4000000

        .globl  b
        .bss                      # shortcut for .section .bss
        .align 4
b:
        .zero   4

Things that don't work:

__attribute__((optimize("mcmodel=large"))) on a per-function basis. Doesn't actually work, and is per-function not per-variable anyway.
https://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html doesn't document any x86 or common variable attributes related to memory-model or size. The only x86-specific variable attribute is ms vs gcc struct layout.

There are x86-specific attributes for functions and types, but those don't help.

Possible hacks:

Put all your section-A variables in a large struct, larger than any section-B global/static objects. Possibly pad it at the end with a dummy array to make it larger: your linker script can probably avoid actually allocating extra space for that dummy array.

Then compile with -mcmodel=medium -mlarge-data-threshold=that size.
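A minimal sketch of that struct hack, assuming made-up names (section_a_vars, sec_a, sec_b_var) and an arbitrary 1 MiB pad. It only shows the C side; the linker script still has to map the large-data sections to section A and avoid allocating the pad.

```c
/* Hypothetical layout: every name and the padding size are assumptions. */
struct section_a_vars {
    int  counter;
    long flags;
    char pad[1 << 20];   /* dummy tail that makes this the largest global object */
};

/* Sanity check: the struct must be larger than the threshold passed on the
   command line (65535 is GCC's documented default for -mlarge-data-threshold). */
_Static_assert(sizeof(struct section_a_vars) > 65535,
               "struct must count as a large object");

struct section_a_vars sec_a;   /* expected to land in a large-data section */
int sec_b_var;                 /* small object, expected to stay RIP-relative */

void write_a(int v) { sec_a.counter = v; }
int  read_a(void)   { return sec_a.counter; }
int  read_b(void)   { return sec_b_var; }
```

Something like gcc -O2 -mcmodel=medium -mlarge-data-threshold=65535 -S on this file should then show absolute addressing in read_a and RIP-relative addressing in read_b. The threshold needs to be at least the size of the biggest section-B object and smaller than the struct, since only objects larger than the threshold are treated as large.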