(C/C++) Do copies of string literals share memory in the TEXT section?

Developer https://www.devze.com 2022-12-14 00:26 Source: Internet
If I call a function like myObj.setType("fluid"); many times in a program, how many copies of the literal "fluid" are saved in memory? Can the compiler recognize that this literal is already defined and just reference it again?


This has nothing to do with C++ (the language). Instead, it is an "optimization" that a compiler can do. So the answer is yes and no, depending on the compiler/platform you are using.

@David This is from the latest draft of the language:

§ 2.14.6 (page 28)

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.

The emphasis is mine.

In other words, string literals in C++ are immutable because modifying a string literal is undefined behavior. So the compiler is free to eliminate redundant copies.

BTW, I am talking about C++ only ;)
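To see what this means in practice, here is a minimal sketch that compares the addresses of two identical literals. The function names are made up for illustration. The address comparison may come out either way depending on the compiler and its options; only the contents are guaranteed to match.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

// Hypothetical helpers: each one returns a pointer to its own "fluid" literal.
const char* first_fluid()  { return "fluid"; }
const char* second_fluid() { return "fluid"; }

// True if the compiler merged the two literals into one object.
// Either result is conforming, so do not rely on it.
bool literals_share_storage() {
    return first_fluid() == second_fluid();
}

// The characters always compare equal, whether or not storage is shared.
bool literals_have_equal_contents() {
    return std::strcmp(first_fluid(), second_fluid()) == 0;
}

// Prints what this particular build decided to do.
void report() {
    assert(literals_have_equal_contents());
    std::cout << "same address: " << std::boolalpha
              << literals_share_storage() << '\n';
}
```

Calling report() from main shows what your own toolchain does. Try it with and without optimization, since the answer can change between builds.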


Yes, it can. Of course, it depends on the compiler. For VC++, it's even configurable:

http://msdn.microsoft.com/en-us/library/s0s0asdt(VS.80).aspx


Yes it can, but there's no guarantee that it will. Define a constant if you want to be sure.
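A rough sketch of the "define a constant" approach follows. The names kFluid and MyObj are stand-ins invented here (the question only shows myObj.setType), and the setter is assumed to take a const char*. Because kFluid is a single named array, every use within this translation unit refers to the same object, no matter what the compiler does with literals.

```cpp
#include <cassert>

// One named object; every use below refers to this same array.
// (Hypothetical name. If placed in a header, each translation unit
// would get its own copy unless it is declared inline in C++17.)
constexpr char kFluid[] = "fluid";

// Hypothetical stand-in for the class in the question.
class MyObj {
public:
    void setType(const char* type) { type_ = type; }
    const char* type() const { return type_; }
private:
    const char* type_ = nullptr;
};

// Both objects end up pointing at kFluid, so the pointers are equal.
bool two_objects_share_constant() {
    MyObj a;
    MyObj b;
    a.setType(kFluid);
    b.setType(kFluid);
    return a.type() == b.type();
}

void demo() {
    assert(two_objects_share_constant());  // holds on every conforming compiler
}
```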


This is a compiler implementation issue. Many compilers that I have used have an option to share or merge duplicate string literals. Allowing duplicate string literals speeds up the compilation process but produces larger executables.


I believe that C/C++ does not specify how this case is handled, but in most cases the compiler would store multiple copies of that string.


2.13.4/2: "whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined".

This permits the optimisation you're asking about.

As an aside, there may be a slight ambiguity, at least locally within that section of the standard. The definition of string literal doesn't quite make clear to me whether the following code uses one string literal twice, or two string literals once each:

const char *a = "";
const char *b = "";

But the next paragraph says "In translation phase 6 adjacent narrow string literals are concatenated". Unless it means to say that something can be adjacent to itself, I think the intention is pretty clear that this code uses two string literals, which are concatenated in phase 6. So it's not one string literal twice:

const char *c = "a" "a";

Still, if you did read that "a" and "a" are the same string literal, then the standard requires the optimisation you're talking about. But I don't think they are the same literal; I think they're different literals that happen to consist of the same characters. This is perhaps made clear elsewhere in the standard, for instance in the general information on grammar and parsing.

Whether it's made clear or not, many compiler-writers have interpreted the standard the way I think it is, so I might as well be right ;-)
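The phase 6 concatenation mentioned above can be checked directly. This is only a small sketch; the helper function name is made up for illustration, and c is the same variable as in the snippet above.

```cpp
#include <cassert>
#include <cstring>

// Two adjacent literal tokens, merged into one literal during translation.
const char* c = "a" "a";

// sizeof sees the merged literal: two characters plus the terminating null.
static_assert(sizeof("a" "a") == 3, "adjacent literals are concatenated");

// Hypothetical helper: length of the merged string at run time.
std::size_t concatenated_length() {
    return std::strlen(c);
}

void check_concatenation() {
    assert(concatenated_length() == 2);
    assert(std::strcmp(c, "aa") == 0);
}
```

This shows only that the two tokens end up as one string. It does not settle whether two separate occurrences of "a" share storage, which remains up to the implementation.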
