
What's a Good Way to Test that Identifiers aren't Being Truncated and Thereby Mixed Up?

Developer https://www.devze.com 2023-04-04 05:22 Source: the web

In C++ class today, we discussed the maximum possible length of identifiers, and how, past a certain length, the compiler will eventually stop treating variables as different. (My professor seems to have implied that really long identifiers are truncated.) I posted another question earlier, hoping to see if the limit is defined somewhere. My question here is a little different. Suppose I wanted to test either a practical or an enforced limit on identifier name lengths. How would I go about doing so? Here's what I'm thinking of doing, but somehow it seems too simple.

  • Step 1: Generate at least two variables with really long names and print them to the console. If the identifier names are really that unlimited, I am not going to waste time typing them. My code should do it for me.
  • Step 2: Attempt to perform some operations with the variables, such as comparing them or doing arithmetic on them. If the compiler stops differentiating, then in theory, certain arithmetic will break, such as x/(reallyLongA-reallyLongB), since reallyLongA and reallyLongB will be so long that the compiler will just treat them as the same thing. At that point, the division operation will become a division-by-zero, which should crash and burn horribly.

Am I approaching this correctly? Will I run out of memory before I "break" the compiler or "runtime"?


I don't think you need to even generate any operations on the variables.

The following code will generate a redefinition error at compile time:

int name;
int name;

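If only the characters up to some position were significant, I'd expect you'd get the same error from two names that differ only after that position: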

int namewithlastsignificantcharacterhere_abc;
int namewithlastsignificantcharacterhere_123;

I'd use a scripting language to generate successively longer names until you get one that breaks. Here's a Ruby one-liner:

C:>ruby -e "(1..2048).each{|i| puts \"int #{'variable'*i}#{i};\"}" > var.txt

When I #include "var.txt" in a C file and compile with VS2008, I get the error

"1>c:\code\quiz\var.txt(512) : fatal error C1064: compiler limit : token overflowed internal buffer"

Line 512 is the first one whose name contains 512 * 8 = 4096 characters of repeated "variable", which matches the 4096 that JRL cited.
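
For anyone without Ruby at hand, the same file can be produced from C++. This is only a sketch that mirrors the one-liner above; the function names long_name and write_declarations are made up for illustration and are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <string>

// Builds the identifier used on line i of the generated file:
// the word "variable" repeated i times, followed by the decimal value of i.
// (Hypothetical helper; lines are numbered from 1.)
std::string long_name(std::size_t i) {
    assert(i >= 1);
    std::string name;
    name.reserve(8 * i + 8);
    for (std::size_t k = 0; k < i; ++k) {
        name += "variable";
    }
    name += std::to_string(i);
    return name;
}

// Writes one declaration per line, "int <name>;", for i = 1..count.
void write_declarations(std::ostream& out, std::size_t count) {
    for (std::size_t i = 1; i <= count; ++i) {
        out << "int " << long_name(i) << ";\n";
    }
}
```

Passing a std::ofstream opened on var.txt and a count of 2048 to write_declarations should reproduce the file used above. Note that the name on line 512 is 4096 letters plus the three digits of "512", so 4099 characters in total, while the name on line 511 is 4091 characters.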


Your professor is wrong. § 2.11/1 of the C++ standard says: "All characters are significant". Certainly compilers may impose a limit on the allowed length, as noted in your other question. That doesn't mean they can ignore characters after that.

He's probably confusing C and C++. The two languages have similar but not identical rules. Historically, C guaranteed as few as six significant characters (for external identifiers in C89).

As for your test, there's a far simpler way to test your hypothesis. Note that

int a;
int a;

is illegal, because you define the same identifier twice. Now if ReallyLongNameA and ReallyLongNameB differed only in non-significant characters, then

int ReallyLongNameA;
int ReallyLongNameB;

would also be a compile-time error, because both would declare the same variable. You don't need to run the code. You can just generate test.cpp with those two lines, and try to compile it. So, write a small test program that creates increasingly long identifier names, writes them to test.cpp, and calls system("path/to/compiler -compileroptions test.cpp"); to see if it compiles.
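
A minimal sketch of such a test program follows. The helper names make_test_source and compiles are hypothetical, and the compiler command is whatever you pass in; nothing here is specific to one toolchain. It also assumes that std::system returns 0 when the compiler succeeds, which holds on common platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>

// Returns a two-line translation unit declaring two ints whose names have
// the given length, share the first (length - 1) characters, and differ
// only in the last character ('A' versus 'B').
std::string make_test_source(std::size_t length) {
    assert(length >= 1);
    const std::string prefix(length - 1, 'a');
    std::string source;
    source += "int " + prefix + "A;\n";
    source += "int " + prefix + "B;\n";
    return source;
}

// Writes the test source to `path`, then runs `compiler_command` on it.
// Returns true if the command reports success (assumed to be a 0 result).
bool compiles(const std::string& compiler_command,
              const std::string& path,
              std::size_t length) {
    {
        std::ofstream out(path);
        out << make_test_source(length);
    }  // the file is flushed and closed here, before the compiler runs
    const std::string command = compiler_command + " " + path;
    return std::system(command.c_str()) == 0;
}
```

One possible call is compiles("g++ -fsyntax-only", "test.cpp", n) for growing n; substitute the command for your own compiler. A failure alone does not prove truncation: read the diagnostic to see whether it is a redefinition error or a hard limit on token length.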


For Windows C++:

Only the first 2048 characters of Microsoft C++ identifiers are significant. Names for user-defined types are "decorated" by the compiler to preserve type information. The resultant name, including the type information, cannot be longer than 2048 characters.

So it seems you could do a pretty simple test, at least with an MS compiler.

Edit: Didn't do extensive testing, but on my Visual Studio Pro 2008 at least, a variable named aaaa... (total length 4095 characters) compiles, and anything longer (>= 4096 characters) gives Fatal Error C1064: compiler limit : token overflowed internal buffer.


I would assume that if it still works after the length reaches some ridiculous size (like > 1MB), the compiler is probably able to handle arbitrarily sized identifiers.

Of course, there's no sure way to tell, as it is entirely possible for the identifier length limit to exceed the amount of memory you have. (A limit of 2^32 - 1 is entirely possible.)
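
If you do want to probe for a limit without trying every length, one approach is to double the length until something is rejected or a cap is reached, and then binary-search the boundary. The sketch below assumes the compiler's behaviour is monotonic (every length up to the limit is accepted, every longer one is rejected) and that length 1 is accepted. The name find_limit and the accepts callback are hypothetical; in practice the callback would wrap a compile attempt at the given length.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the largest length in [1, cap] for which `accepts` returns true,
// assuming accepts(1) is true and acceptance is monotonic in the length.
// Returns cap if nothing up to the cap is rejected.
std::size_t find_limit(const std::function<bool(std::size_t)>& accepts,
                       std::size_t cap) {
    assert(cap >= 1);
    std::size_t good = 1;  // longest length known to be accepted
    std::size_t bad = 0;   // shortest length known to be rejected (0 = none yet)

    // Phase 1: double the length until a rejection is seen or the cap is hit.
    while (good < cap) {
        const std::size_t next = (good > cap / 2) ? cap : good * 2;
        if (accepts(next)) {
            good = next;
        } else {
            bad = next;
            break;
        }
    }
    if (bad == 0) {
        return cap;  // no rejection observed up to the cap
    }

    // Phase 2: binary search between the accepted and rejected lengths.
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (accepts(mid)) {
            good = mid;
        } else {
            bad = mid;
        }
    }
    return good;
}
```

The cap is what keeps the search from running until memory is exhausted: a result equal to the cap only says that no limit was found below it, not that none exists.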
