开发者

Why do I get a C malloc assertion failure?

开发者 https://www.devze.com 2023-01-02 18:03 出处:网络
I am implementing a divide and conquer polynomial algorithm so I can benchmar开发者_如何学运维k it against an OpenCL implementation, but I can\'t get malloc to work. When I run the program, it allocat

I am implementing a divide and conquer polynomial algorithm so I can benchmar开发者_如何学运维k it against an OpenCL implementation, but I can't get malloc to work. When I run the program, it allocates a bunch of stuff, checks some things, then sends the size/2 to the algorithm. Then when I hit the malloc line again it spits out this:

malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Aborted

The line in question is:

int *mult(int size, int *a, int *b) {
    int *out,i, j, *tmp1, *tmp2, *tmp3, *tmpa1, *tmpa2, *tmpb1, *tmpb2,d, *res1, *res2;
    fprintf(stdout, "size: %d\n", size);

    out = (int *)malloc(sizeof(int) * size * 2);
}

I checked size with a fprintf, and it is a positive integer (usually 50 at that point). I tried calling malloc with a plain number as well and I still get the error. I'm just stumped at what's going on, and nothing from Google I have found so far is helpful.

Any ideas what's going on? I'm trying to figure out how to compile a newer GCC in case it's a compiler error, but I really doubt it.


99.9% likely that you have corrupted memory (over- or under-flowed a buffer, wrote to a pointer after it was freed, called free twice on the same pointer, etc.)

Run your code under Valgrind to see where your program did something incorrect.


To give you a better understanding of why this happens, I'd like to expand upon @r-samuel-klatchko's answer a bit.

When you call malloc, what is really happening is a bit more complicated than just giving you a chunk of memory to play with. Under the hood, malloc also keeps some housekeeping information about the memory it has given you (most importantly, its size), so that when you call free, it knows things like how much memory to free. This information is commonly kept right before the memory location returned to you by malloc. More exhaustive information can be found on the internet™, but the (very) basic idea is something like this:

+------+-------------------------------------------------+
+ size |                  malloc'd memory                +
+------+-------------------------------------------------+
       ^-- location in pointer returned by malloc

Building on this (and simplifying things greatly), when you call malloc, it needs to get a pointer to the next part of memory that is available. One very simple way of doing this is to look at the previous bit of memory it gave away, and move size bytes further down (or up) in memory. With this implementation, you end up with your memory looking something like this after allocating p1, p2 and p3:

+------+----------------+------+--------------------+------+----------+
+ size |                | size |                    | size |          +
+------+----------------+------+--------------------+------+----------+
       ^- p1                   ^- p2                       ^- p3

So, what is causing your error?

Well, imagine that your code erroneously writes past the amount of memory you've allocated (either because you allocated less than you needed as was your problem or because you're using the wrong boundary conditions somewhere in your code). Say your code writes so much data to p2 that it starts overwriting what is in p3's size field. When you now next call malloc, it will look at the last memory location it returned, look at its size field, move to p3 + size and then start allocating memory from there. Since your code has overwritten size, however, this memory location is no longer after the previously allocated memory.

Needless to say, this can wreck havoc! The implementors of malloc have therefore put in a number of "assertions", or checks, that try to do a bunch of sanity checking to catch this (and other issues) if they are about to happen. In your particular case, these assertions are violated, and thus malloc aborts, telling you that your code was about to do something it really shouldn't be doing.

As previously stated, this is a gross oversimplification, but it is sufficient to illustrate the point. The glibc implementation of malloc is more than 5k lines, and there have been substantial amounts of research into how to build good dynamic memory allocation mechanisms, so covering it all in a SO answer is not possible. Hopefully this has given you a bit of a view of what is really causing the problem though!


My alternative solution to using Valgrind:

I'm very happy because I just helped my friend debug a program. His program had this exact problem (malloc() causing abort), with the same error message from GDB.

I compiled his program using Address Sanitizer with

gcc -Wall -g3 -fsanitize=address -o new new.c
              ^^^^^^^^^^^^^^^^^^

And then ran gdb new. When the program gets terminated by SIGABRT caused in a subsequent malloc(), a whole lot of useful information is printed:

=================================================================
==407==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060000000b4 at pc 0x7ffffe49ed1a bp 0x7ffffffedc20 sp 0x7ffffffed3c8
WRITE of size 104 at 0x6060000000b4 thread T0
    #0 0x7ffffe49ed19  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5ed19)
    #1 0x8001dab in CreatHT2 /home/wsl/Desktop/hash/new.c:59
    #2 0x80031cf in main /home/wsl/Desktop/hash/new.c:209
    #3 0x7ffffe061b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #4 0x8001679 in _start (/mnt/d/Desktop/hash/new+0x1679)

0x6060000000b4 is located 0 bytes to the right of 52-byte region [0x606000000080,0x6060000000b4)
allocated by thread T0 here:
    #0 0x7ffffe51eb50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x8001d56 in CreatHT2 /home/wsl/Desktop/hash/new.c:55
    #2 0x80031cf in main /home/wsl/Desktop/hash/new.c:209
    #3 0x7ffffe061b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

Let's take a look at the output, especially the stack trace:

The first part says there's a invalid write operation at new.c:59. That line reads

memset(len,0,sizeof(int*)*p);
             ^^^^^^^^^^^^

The second part says the memory that the bad write happened on is created at new.c:55. That line reads

if(!(len=(int*)malloc(sizeof(int)*p))){
                      ^^^^^^^^^^^

That's it. It only took me less than half a minute to locate the bug that confused my friend for a few hours. He managed to locate the failure, but it's a subsequent malloc() call that failed, without being able to spot this error in previous code.

Sum up: Try the -fsanitize=address of GCC or Clang. It can be very helpful when debugging memory issues.


You are probably overrunning beyond the allocated mem somewhere. then the underlying sw doesn't pick up on it until you call malloc

There may be a guard value clobbered that is being caught by malloc.

edit...added this for bounds checking help

http://www.lrde.epita.fr/~akim/ccmp/doc/bounds-checking.html


I got the following message, similar to your one:


    program: malloc.c:2372: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.

Made a mistake some method call before, when using malloc. Erroneously overwrote the multiplication sign '*' with a '+', when updating the factor after sizeof()-operator on adding a field to unsigned char array.

Here is the code responsible for the error in my case:


    UCHAR* b=(UCHAR*)malloc(sizeof(UCHAR)+5);
    b[INTBITS]=(some calculation);
    b[BUFSPC]=(some calculation);
    b[BUFOVR]=(some calculation);
    b[BUFMEM]=(some calculation);
    b[MATCHBITS]=(some calculation);

In another method later, I used malloc again and it produced the error message shown above. The call was (simple enough):


    UCHAR* b=(UCHAR*)malloc(sizeof(UCHAR)*50);

Think using the '+'-sign on the 1st call, which lead to mis-calculus in combination with immediate initialization of the array after (overwriting memory that was not allocated to the array), brought some confusion to malloc's memory map. Therefore the 2nd call went wrong.


We got this error because we forgot to multiply by sizeof(int). Note the argument to malloc(..) is a number of bytes, not number of machine words or whatever.


i got the same problem, i used malloc over n over again in a loop for adding new char *string data. i faced the same problem, but after releasing the allocated memory void free() problem were sorted


I was porting one application from Visual C to gcc over Linux and I had the same problem with

malloc.c:3096: sYSMALLOc: Assertion using gcc on UBUNTU 11.

I moved the same code to a Suse distribution (on other computer ) and I don't have any problem.

I suspect that the problems are not in our programs but in the own libc.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号