This code is undefined, why is it running? How do I crash it?

Developer https://www.devze.com 2023-02-11 10:35 Source: web
I ran the code below in Visual Studio and expected to get a runtime error or some other kind of error. I got nothing, absolutely nothing. The program returned exit code 9; when I commented out the line `v.v=9;` it returned 3. I ran it on codepad and it gave me no errors there either.

Is there a compiler that will tell me this code is incorrect? If it is correct, why is it? I know const A& is legal, but AFAIK the code below isn't.

class A
{
public:
    int v;
    A& get()
    {
        return *this;
    }
};

A& func()
{
    A a;
    a.v=3;
    return a.get();
}

int main()
{
    A& v = func();
    v.v=9;
    return v.v;
}


Undefined behavior is undefined behavior. You can't expect it to do anything in particular, including crash.

There is no compiler I know of that will catch all types of UB, and I don't think it's possible. You could crank up the warning level of your compiler, but I don't think it would warn you even then. Using get() to capture a reference to a local variable will, I believe, effectively hide what you're doing from most, if not all, compilers. The effort required to catch such instances of suicide doesn't seem to me to be worth it.

That's just part of the life of a C++ developer.


It appears to work because the memory the instance was stored in hasn't been overwritten yet. This obviously wouldn't fly in a real project.


Using a reference to a local object after it has gone out of scope is undefined behavior, and as such it does not require any diagnostic from the compiler.

1.3.12 undefined behavior

behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements. Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior. [Note: permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).


You ask why it can't be detected at runtime; how could it be? The only way would be to check, at every memory access done via a pointer/reference whose value is in the range currently used for the stack, that the address is below the current stack pointer.

This would be a hugely costly operation (in terms both of time spent and executable size increase) since it would have to be done for every memory access, so it's not done.

On the other hand, the checks for the arrays you described work (AFAIK) by checking if a flag value put between stack frames has been overwritten, and such check is performed only when the function returns (by the way, such checks in VC++ can be enabled only in unoptimized builds).

Another kind of check the compiler can do is static analysis of the code. It is not perfect, but it often works fine and warns you if you do something nonsensical, such as directly returning a reference or pointer to a local variable. In this case it didn't warn you because your example is quite convoluted and the static analysis didn't manage to catch it.


I think you're making a big logical mistake. Undefined Behavior doesn't mean that a program will crash... it means that anything could happen.

If you're lucky (very, very lucky) you get a crash. What normally happens instead is that, after you do something that invokes undefined behavior, the program simply keeps running as if nothing had happened, until a million instructions later a perfectly legal piece of code does something crazy. The normal reaction from programmers is then to blame the compiler, the OS version, defective RAM and voodoo dolls hidden in the drawer by hostile colleagues.

If you are just a little unlucky, the program will instead behave exactly as you expect, producing the result you expect and exiting cleanly without any problems at all. All this until the big demo day, when it will crash badly in front of the audience just after you say "And now let's save our work...".

But why isn't undefined behavior checked in C++?

One of the main philosophical foundations of C++ is simply that programmers make no errors. This means that when a programmer does make an error there is no "runtime error angel" that will come to help, just "undefined behavior daemons" that will try to bite.

This was done to avoid leaving room for another language between C++ and assembler: it must be possible to write efficient code, and runtime error angels are too heavy to carry around. While it's certainly easy to write bloated and slow code in C++, it's also possible to write efficient code if you have a good grasp of how the language works and keep a constant focus on performance.

When you see "this is undefined behavior", it simply means that compiler writers are free to ignore whatever is going to happen. Checking that those rules are not violated is a burden on the programmers using C++, not on the C++ compiler.

In my opinion, the fact that "undefined behavior" means the outcome is unpredictable, combined with how very easy it is to invoke undefined behavior by mistake, makes C++ a terrible language to learn by experimentation, because when you make a mistake the system won't clearly tell you so. It's also, in my opinion, a terrible language for a beginner (because it's natural for beginners to make more mistakes).

The only reasonable path to C++ is:

  1. Learning it by studying and not by experimenting

    C++ is a complex language with a long evolution history. In some parts it's illogical because of historical accidents. Even if you're smart you will never be able to guess the historical reasons for an apparently illogical choice. History must be studied.

  2. You must think very carefully about every single statement you write

    As I said before, you can't expect C++ to detect all your mistakes. Anything can happen when you make a mistake (including nothing!), and this means that debugging can be very, very hard. The only viable option is to try not to introduce bugs in the first place. Writing code without serious thought and hoping that testing and debugging will find the bugs is IMO bad in any language, but a truly suicidal approach in C++.


To give an example of why this is definitely undefined behaviour, change your main() as follows:

#include <iostream>
// ...
int main() {
    A& v = func();
    v.v=9;

    int over[9000] = {1};
    std::cout << v.v;

    return 0;
}

At least for me (and consistently), that overwrote the memory where v.v was stored. It may not behave the same way for others, because how memory is handled is implementation-dependent. It should give you an idea, however.


The compiler can't tell that it's invalid code. Returning *this as a reference would be OK if the lifetime of the reference were shorter than the lifetime of the object it refers to. Determining which of the two lives longer is, in general, beyond the capabilities of the compiler at compile time, since the lifetimes can depend on what happens at run time.

I suppose a sufficiently clever compiler (or more likely, lint tool) could put in checks for certain special cases, possibly including the case you give here. The question is whether it's worth implementing such a check when it will only catch obvious cases anyway.


In addition to what everyone said, the typical stack implementation on a modern OS allocates stack in 4-8KB increments (pages). Stack usage at program startup is typically small, and the object's data sits only a few bytes past the top of the stack. Even if only one stack page is allocated, there's a decent chunk of perfectly readable and writable space past the stack top, so accessing that memory does not cause a runtime error.

But yes, it's undefined behavior.
