Investigating a bad free() pointer reference

Developers https://www.devze.com 2023-03-09 14:44 Source: Internet
My program crashes in a unit test because free() is called on a bad pointer. The error message is:

*** glibc detected *** /home/user/main.out: free(): invalid pointer: 0x006d0065 ***

The code looks roughly like this:

// constructor is MyObject( const std::string & )
// and copies the string into its own std::string member
MyObject *obj = new MyObject("some string arg");
if(obj->isValid)
{
   log("Success\n");
}
delete obj; // if I remove this, the program doesn't crash...

Here is what I have tried so far:

  1. I tried to trace the pointers I allocate and free, but the faulting address lies in a completely different address range: my own (successful) deletions look like Delete buffer @0x850ed0. So I suspect something is trying to free a static char pointer. The faulting address is also identical across several runs, which supports that hypothesis.

  2. I can't get anything useful out of GDB, because on this platform the backtrace stops right after abort():

    Program received signal SIGABRT, Aborted.
    [Switching to Thread 0x344314e8 (LWP 1443)]
    0x2a3dd658 in raise () from /lib/libc.so.6
    (gdb) bt
    #0 0x2a3dd658 in raise () from /lib/libc.so.6
    #1 0x2a3dea2c in abort () from /lib/libc.so.6
    Backtrace stopped: frame did not save the PC

  3. I tried to use hexdump to dump the program file around the address on which free() fails, with this command: hexdump -C ~/main.out -s 0x6d0000 -n 2000

    It gives me this (the fault is free(0x6d0065)):

    006d0060  e8 32 06 00 2c e6 02 5f  5a 4e 4b 53 73 34 66 69  |.2..,.._ZNKSs4fi|
    006d0070  6e 64 45 63 6a 00 ab 00  00 00 01 3e 58 00 00 1d  |ndEcj......>X...|

    That looks like the mangled name of a std::string member function, which is quite strange. I suspect this is a misuse of hexdump, though: it shows offsets in the file, and the program ends up at different addresses once it is loaded into memory.

  4. I tried readelf ~/main.out -a | grep 6d0065 to no avail as well (gives no hit)
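The string visible in the hexdump from step 3, _ZNKSs4findEcj, is a mangled C++ symbol name. A minimal sketch of how it can be decoded from inside a program follows; it assumes a GCC or Clang toolchain that ships <cxxabi.h>, and the helper name demangle is made up for this illustration. On the command line, c++filt does the same job.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: demangle an Itanium C++ ABI symbol name.
// Returns the input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc().
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the caller owns the buffer returned by __cxa_demangle
    return result;
}
```

Calling demangle("_ZNKSs4findEcj") should yield a const find member of std::string taking a char and an unsigned int. That only says that file offset 0x6d0065 happens to fall inside a table of symbol names; it says nothing about what lives at that address in the running process.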

I'm not experienced at debugging under these conditions. Do you have any idea how I could find out what this address means to the program?

Edit:

  • The program runs on an embedded platform (SH4); which is not supported by valgrind (very sadly...).

  • Some more details on what this class does: it uses the cURL library to retrieve an XML file from the internet, then parses it with the pugixml library.


Wild shot in the dark... You have a base class that does not have a virtual destructor, and you are actually instantiating a derived class and storing the result of the new in a pointer to the base. You delete through a base pointer and the undefined behavior in this particular case is causing that effect.

Why do I believe that? The pointer passed to free() was not returned by malloc(): it is an odd address, so it cannot be properly aligned, and malloc() always returns suitably aligned blocks. That means there is an offset between the pointer returned by new and the pointer you stored, which suggests that the stored pointer refers to a base-class subobject. If the class had a virtual destructor, delete would be able to determine the most derived object type and, in doing so, would adjust the pointer back to the address that new allocated. Without the virtual destructor, the delete is effectively transformed into obj->~MyObject(); free( obj ); with no offset correction.
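To make the offset argument concrete, here is a small sketch that measures the distance between a complete object and one of its base subobjects without deleting anything (so it stays clear of undefined behavior). The types First, Second and Derived and the function base_offset are invented for this illustration; they are not the asker's classes.

```cpp
#include <cstddef>

struct First  { int a = 0; };
struct Second { int b = 0; };                 // note: no virtual destructor
struct Derived : First, Second { int c = 0; };

// Byte distance between the start of the complete object and the address
// seen through a Second* that points at the same object.
std::ptrdiff_t base_offset(Derived& d) {
    Second* as_base = &d;  // the implicit conversion adjusts the pointer value
    return reinterpret_cast<const char*>(as_base) -
           reinterpret_cast<const char*>(&d);
}
```

On common ABIs the result is non-zero (typically sizeof(First)), so a delete through a Second* would hand free() an address that malloc() never returned. Declaring virtual ~Second() = default; is what lets delete recover the original address.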

Another thing that could cause the same type of problem would be allocating with new[], and deallocating with delete (new[] will allocate extra space to store the number of elements, usually before the returned pointer). Again the problem would be that the pointer that is passed to free by the implementation of delete is not corrected to match the one returned by malloc... But in this particular case, the offset would probably not be an odd number.
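A mismatch between new[] and delete cannot be demonstrated safely, but the matching pairing can. The sketch below uses a made-up Tracked type that counts live instances, and relies on std::unique_ptr<T[]>, whose default deleter calls delete[] rather than delete.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocates n objects with new[] and releases them with the matching delete[].
// Returns the number of instances still alive afterwards.
int live_after_array_roundtrip(std::size_t n) {
    {
        std::unique_ptr<Tracked[]> items(new Tracked[n]);  // deleter uses delete[]
    }   // all n destructors run here
    return Tracked::live;
}
```

Letting the smart pointer (or a std::vector) pick the deallocation form removes the opportunity to pair new[] with a plain delete by mistake.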


Your class MyObject is doing something very, very bad.

And you can always (well, almost always) use the debugger. It's a bit challenging at times in some domains such as real-time programming, but this does not appear to be the case here.

To use the debugger in a meaningful way, your code has to be compiled with debugging information enabled: the -g option with the GNU compiler.

EDIT: My guess with regard to "doing something very, very bad" is that the class is deleting something that it has no business deleting. For example:

class MyObject {
public:
    MyObject(const char * strarg) : str(strarg) {}
    ~MyObject() { delete str; }  // BUG: str was never allocated with new by this class
private:
    const char * str;
};
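For contrast, a sketch of the same class with the ownership problem removed: the object stores its own copy of the string in a std::string member, as the question says the real class does, and needs no hand-written destructor. The accessor value() and the member name str_ are assumptions added for this illustration.

```cpp
#include <string>

class MyObject {
public:
    // Copies the argument; the caller keeps ownership of whatever it passed in.
    explicit MyObject(const std::string& arg) : str_(arg) {}

    const std::string& value() const { return str_; }  // hypothetical accessor

private:
    std::string str_;  // owned copy, released automatically by ~std::string
};
```

With this layout, delete obj; only ever frees memory that the object itself allocated.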


Ok, finally nailed the bug down.

For those interested, it was a structure that wasn't zeroed in the constructor of the class using it.

The schema was like this:

MyObj ---> BaseClass ---> structure member not initialized

Since this structure holds some pointers to strings, the uninitialized garbage values passed the NULL check and I ended up freeing a bad pointer.

What's sad is that the base class would usually fill in every field of the structure, overwriting the garbage values. Only when I inherited from the base class did I encounter the problem.
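A minimal sketch of the kind of fix described, assuming a plain structure of char pointers released with free(): the names Fields, title, url, fields_ and has_title are invented here, and only BaseClass and MyObj come from the schema above. Value-initializing the member in the constructor guarantees that every pointer starts out null.

```cpp
#include <cstdlib>

struct Fields {      // hypothetical stand-in for the structure in question
    char* title;
    char* url;
};

class BaseClass {
public:
    BaseClass() : fields_() {}  // value-initialization: both pointers become null
    virtual ~BaseClass() {
        std::free(fields_.title);  // free() on a null pointer is a no-op
        std::free(fields_.url);
    }
    // The class owns raw pointers, so copying is disabled to avoid double frees.
    BaseClass(const BaseClass&) = delete;
    BaseClass& operator=(const BaseClass&) = delete;

    bool has_title() const { return fields_.title != nullptr; }

protected:
    Fields fields_;
};

class MyObj : public BaseClass {};  // a derived class that never fills the fields
```

Even when a derived class never assigns the fields, the destructor now sees null pointers instead of whatever happened to be in memory.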

Looking back at this, I guess I could have found it much faster with a working GDB setup. I'll have to look into that.
