开发者

Isn't this legacy code which returns a local char array wrong?

开发者 https://www.devze.com 2022-12-22 11:27 出处:网络
I came across some legacy code that contains a function like this: LPCTSTR returnString() { char buffer[50000];

I came across some legacy code that contains a function like this:

LPCTSTR returnString()
{
    char buffer[50000];
    LPCTSTR t;

    /*Some code here that copies some string into buffer*/

    t = buffer;
    return t; 
}

Now, I strongly suspect that this is wrong. I tried calling the function and it does return the string you expect it to return. However, I don't really see how that happens: isn't the char array supposed to be stored on the stack, and thus deallocated after the function exits? If I'm wrong, and it gets stored on the heap, isn't this fu开发者_高级运维nction creating a memory leak?


Your code is exhibiting undefined behaviour - in this case, the UB is that it appears to "work". If you want the array stored on the heap, you need to allocate it with new[]. The caller of the function will then be responsible for deleting it via the pointer the function returns.


No memory leak, but the function is still wrong - the buffer is indeed created on the stack, and the fact that the caller is able to read it is just luck (it's accessible only right after the call to returnString() function.) That buffer might be overwritten by any further function calls or other stack manipulation.

The proper way to pass data up the call chain is to provide a buffer and the size to a function to fill.


You're correct, this isn't guaranteed to work; and in fact, if you truly store a 50,000 character string in this buffer, then call some function (which calls a function, which calls a function..) after this, in almost every system this string will be corrupted, due to the function's stack-frame being pushed onto the stack.

The only reason it appears to work is that stale stack memory (on most systems) isn't cleared after a function returns.

[Edit] To clarify, it appears to be working because the stack hasn't had the chance to grow 50,000 bytes. Try changing it to char buffer[10]; and call some functions after returnString(), and you should see it corrupt.


Your feeling is correct; the code is very wrong.

The memory is indeed on the stack, and goes away with the function end.

If the buffer were static then you could at least expect it to work for one call at a time (in a single-threaded app).


There is very little difference between an array and a pointer in C/C++. So the statement:

t = buffer;

Actually works because "buffer" means the address of the array. The address is not explicitly stored in memory though until you put it in t (i.e. buffer is not a pointer). buffer[n] and t[n] will reference the same element of the array. Your array is allocated on the stack though, so the memory is freed - not cleared - when the function returns. If you look at it before it gets overwritten by something else then it will appear fine.


this code returns a pointer to memory allocated on the stack. it's very dangerous because if you try to pass this pointer to another function the memory will be overwritten by the second function call.

instead of this, you can use a static buffer:

static char buffer[50000];

which is not allocated on the stack so a pointer to it remains valid. (this is not thread safe, obviously).


You are right, code is wrong. :)

You should study memory allocation in C/C++. Data can reside in two areas: stack and heap. Local variables are stored in stack. malloced and newed data is stored in heap. State of stack is local to function - variables live in stack frame - the container that will be freed when function returns. So pointers become broken.

Heap is global, so all data is stored there until explicitly deleted or freed by programmer. You can rely on that area.


While everyone's thoughts that the behavior is undefined, and in this case it appears to be true, it is important in cases like this to consider other possibilities.

For example, an overloaded operator=(const char*) could be behind the scenes allocating the requirement memory. Although this isn't the case (to the best of my knowledge) with the Microsoft typedefs, it is an important thing to be aware of in cases like this.

In this case, however, it does seem to just be convenient that it is working, and it is certainly not the guaranteed behavior. As others have noted, this really ought to be fixed.


This is a dangerous bug lurking in your code. In C and C++ you are not allowed to return a pointer to stack data in a function. It results in undefined behavior. I'll explain why.

A C/C++ program works by pushing data on and off the program stack. When you call a function, all the parameters are pushed onto the stack, and then all the local variables are pushed onto the stack as well. As the program executes it may push and pop more items onto and off the stack in your function. In your example, buffer is pushed onto the stack, and then t is pushed onto the stack. The stack might look like,

  • Previous Stack
  • Parameters
  • (other data)
  • buffer (50000 bytes)
  • t (sizeof pointer)

At this point, t is on the stack, and it points to buffer, which is also on the stack. When the function returns, the runtime pops all the variables on the stack off, which includes t, buffer and the parameters. In your case, you return the pointer t, thus making a copy of it in the calling function.

If the calling function then looks at what t points to, it is going to find that it points to memory on the stack that may or may not exist. (The runtime popped it off the stack, but the data on the stack may still be there by coincidence, maybe not).

The good news is, it's not hopeless. There are automated tools that can search for these kinds of errors in your software and report them. They are called static analysis tools. Sentry is one such example of a program that can report this kind of defect.


I agree that what is most likley going on here is the incorrect return of an address of an automatic variable, however, I echo KevenK that this is not guaranteed if it is C++ as the tag specifies. We do not know what LPCTSTR is. What if an included header contains something like this:

(yes I know this leaks, not the point)


class LPCTSTR{
private:
  char * foo;

public:
  LPCTSTR & operator=(char * in){
    foo = strdup(in);
  }

  const char * getFoo(){
    return foo;
  }


};

0

精彩评论

暂无评论...
验证码 换一张
取 消