How do global variables contribute to the size of the executable?

Developer https://www.devze.com 2023-01-28 03:00 Source: Internet
Does having global variables increase the size of the executable? If yes how? Does it increase only the data section size or also the text section size?

If I have a global variable and initialization as below:

char g_glbarr[1024] = {"jhgdasdghaKJSDGksgJKASDGHKDGAJKsdghkajdgaDGKAjdghaJKSDGHAjksdghJKDG"};

Now, does this add 1024 bytes to the data section and the size of the initialization string to the text section?

If, instead of allocating space for this array statically, I malloc it and then do a memcpy, will only the data section size be reduced, or will the text section size be reduced as well?


Yes, it does. Compilers generally store global variables in the data segment. A constant char array used in your code (like printf("<1024 char array goes here");) will usually go to a data segment as well, often a read-only one (AFAIK some old compilers /Borland?/ may store it in the text segment). You can force the compiler to put a global variable in a custom section (for VC++ it was #pragma data_seg(<segment name>)).

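Dynamic memory allocation doesn't affect the data/text segments, since it allocates memory on the heap at run time. Keep in mind, though, that whatever you memcpy into the allocated block still has to be stored somewhere in the executable.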


The answer is implementation-dependent, but for sane implementations this is how it works for variables with static storage duration (global or otherwise):

  • Whenever the variable is initialized, the whole initialized value of the object will be stored in the executable file. This is true even if only the initial part of it is explicitly initialized (the rest is implicitly zero).
  • If the variable is constant and initialized, it will be in the "text" segment, or equivalent. Some systems (modern ELF-based, maybe Windows too?) have a separate "rodata" segment for read-only data to allow it to be marked non-executable, separate from program code.
  • Non-constant initialized variables will be in the "data" segment in the executable, which is mapped into memory in copy-on-write mode by the operating system when the program is loaded.
  • Uninitialized variables (which are implicitly zero as per the standard) will have no storage reserved in the executable itself, but a size and offset in the "bss" segment, which is created at program load-time by the operating system.
  • Such uninitialized variables may be created in a separate read-only "bss"-like segment if they're const-qualified.


I am not speaking as an expert, but I would guess that simply having that epic string literal in your program would increase the size of your executable. What you do with that string literal doesn't matter, because it has to be stored somewhere.

Why does it matter which "section" of the executable is increased? This isn't a rhetorical question!


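The answer is slightly implementation sensitive, but in general, no: the initializer string does not add to the text section. Note that g_glbarr as declared is an array of 1024 chars, not a pointer to char, so the string is stored directly inside the array's 1024 bytes in the data section (followed by zero padding) and there is no second copy elsewhere. If it were declared as a pointer instead (char *g_glbarr = "..."), the string itself would be put into the data section with the constant strings, and the pointer would be initialized with the address of the string, which is resolved at link time.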

Update

@Jay, it's sorta kinda the same. The integers (usually) just are in-line: the compiler will come as close as it can to just putting the constant in the code, because that's such a common case that most normal architectures have a straightforward way of doing it from immediate data. The string constants will still be in some read-only data section. So when you make something like:

// warning: I haven't compiled this and wouldn't normally
// do it quite this way so I'm not positive this is
// completely grammatical C
struct X {int a; char * b; } x = { 1, "Hello" } ; 

the 1 becomes "immediate" data, the "Hello" is allocated in read-only data somewhere, and the compiler will just generate something that allocates a piece of read-write data that looks something like

x:
x.a:   WORD    1
x.b:   WORD    @STR42

where STR42 is a symbolic name for the location of the string "Hello" in memory. Then when everything is linked together, the @STR42 is replaced with the actual virtual address of the string in memory.
