开发者

Understanding C-strings & string literals in C++

开发者 https://www.devze.com 2023-04-11 06:22 出处:网络
I have a few questions I would like to ask about string literals and C-strings. So if I have something like this:

I have a few questions I would like to ask about string literals and C-strings.

So if I have something like this:

char cstr[] = "c-string";

As I understand it, the string literal is created in memory with a terminating null byte, say for example starting at address 0xA0 and ending at 0xA9, and from there the address is returned and/or casted to type char [ ] which then points to the address.

It is then legal to perform this:

for (int i = 0; i < (sizeof(array)/sizeof(char)); ++i)
    cstr[i] = 97+i;

So in this sense, are string literals able to be modified as long as they are casted to the type char [ ] ?

But with regular pointers, I've come to开发者_开发百科 understand that when they are pointed to a string literal in memory, they cannot modify the contents because most compilers mark that allocated memory as "Read-Only" in some lower bound address space for constants.

char * p = "const cstring";
*p = 'A'; // illegal memory write

I guess what I'm trying to understand is why aren't char * types allowed to point to string literals like arrays do and modify their constants? Why do the string literals not get casted into char *'s like they do to char [ ]'s? If I have the wrong idea here or am completely off, feel free to correct me.


The bit that you're missing is a little compiler magic where this:

char cstr[] = "c-string"; 

Actually executes like this:

char *cstr = alloca(strlen("c-string")+1);
memcpy(cstr,"c-string",strlen("c-string")+1);

You don't see that bit, but it's more or less what the code compiles to.


char cstr[] = "something"; is declaring an automatic array initialized to the bytes 's', 'o', 'm', ...

char * cstr = "something";, on the other hand, is declaring a character pointer initialized to the address of the literal "something".


In the first case you are creating an actual array of characters, whose size is determined by the size of the literal you are initializing it with (8+1 bytes). The cstr variable is allocated memory on the stack, and the contents of the string literal (which in the code is located somewhere else, possibly in a read-only part of the memory) is copied into this variable.

In the second case, the local variable p is allocated memory on the stack as well, but its contents will be the address of the string literal you are initializing it with.

Thus, since the string literal may be located in a read-only memory, it is in general not safe to try to change it via the p pointer (you may get along with, or you may not). On the other hand, you can do whatever with the cstr array, because that is your local copy that just happens to have been initialized from the literal.

(Just one note: the cstr variable is of a type array of char and in most of contexts this translates to pointer to the first element of that array. Exception to this may be e.g. the sizeof operator: this one computes the size of the whole array, not just a pointer to the first element.)


char cstr[] = "c-string";

This copies "c-string" into a char array on the stack. It is legal to write to this memory.

char * p = "const cstring";
*p = 'A'; // illegal memory write

Literal strings like "c-string" and "const cstring" live in the data segment of your binary. This area is read-only. Above p points to memory in this area and it is illegal to write to that location. Since C++11 this is enforced more strongly than before, in that you must make it const char* p instead.

Related question here.

0

精彩评论

暂无评论...
验证码 换一张
取 消