Context :
char buffer[99]; int* ptr_int=(int*)(buffer+n);
Then I do some time-consuming operations on *ptr_int and measure the execution time using QueryPerformanceCounter from windows.h.
Confusion:
For values of n from 0 to 4, execution time is about 12 secs.
For values of n of 5, 6, 7, execution time is about 20 secs.
For values of n of 32, 33, execution time is again about 12 secs.
This may be due to alignment but can someone please explain how exactly?
Pentium dual core T2410/winxp/g++3.4.2(mingw-special)
Edit
I am not trying to avoid the alignment issue by using a better approach; I am trying to find out why I suddenly have an alignment problem with int* ptr_int=(int*)(buffer+5);
No issue with: int* ptr_int=(int*)(buffer+3);
or: int* ptr_int=(int*)(buffer+33);
On modern CPUs, data needs to be aligned properly, or else there'll be hell to pay. A 32-bit integer needs to be aligned on a 4-byte boundary; otherwise the CPU may internally need to read two words and shift things around to assemble the value. Some CPUs will actually fault if you try to read an unaligned integer.
Likewise, a 128-bit __vector4 needs to be aligned by 16 bytes, etc.
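To see which offsets into a buffer are misaligned, you can test the address directly. This is only a minimal sketch: the helper names is_aligned and load_int_unaligned are made up for illustration, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>

// Hypothetical helper: true if p is a multiple of `alignment`.
// `alignment` is assumed to be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: reads an int from any address without
// dereferencing a misaligned int*. memcpy leaves it to the compiler
// to pick a load sequence that is safe on the target.
inline int load_int_unaligned(const void* p) {
    assert(p != NULL);
    int value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

With the buffer from the question, is_aligned(buffer + n, sizeof(int)) tells you whether a given n lands on an int boundary. The answer depends on where the compiler placed buffer, not only on n.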
By the way, there are other factors that come into play, like the data cache line, so the first time you access a new cache line, there'll be a big penalty - subsequent reads will be much faster.
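A misaligned access that also straddles two cache lines tends to cost more than one that stays inside a single line. That is one possible reason why only some misaligned offsets would be slow, although nothing in this thread confirms it. The sketch below checks for a straddle; the name crosses_cache_line is hypothetical, and the 64-byte line size is an assumption that you should verify for your CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Assumed cache line size in bytes. 64 is common on x86 but not guaranteed.
const std::size_t kAssumedLineSize = 64;

// Hypothetical helper: true if the `size` bytes starting at p touch
// more than one cache line of `line_size` bytes.
inline bool crosses_cache_line(const void* p, std::size_t size,
                               std::size_t line_size = kAssumedLineSize) {
    assert(size > 0 && line_size > 0);
    uintptr_t first = reinterpret_cast<uintptr_t>(p);
    uintptr_t last = first + size - 1;   // address of the last byte accessed
    return (first / line_size) != (last / line_size);
}
```

For example, calling crosses_cache_line(buffer + n, sizeof(int)) for each n you benchmarked would show whether the slow offsets are the ones that span two lines.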
Very likely, as the others said, this is an alignment issue. There are a few ways you can fix it and test it.
The easiest is to use malloc, or new, to allocate the buffer on the heap. malloc guarantees that the returned pointer is suitably aligned for any fundamental data type. On 64-bit Intel platforms that typically means 16-byte (128-bit) alignment.
char * buffer = (char*)malloc( n * sizeof(int) ); // the cast is required in C++; needs <stdlib.h>
int * at = (int*)buffer + ndx;
It looks like your original +n was also wrong. You offset the char pointer, which moves 1 byte per step, rather than the int pointer, which moves 4 bytes per step. This could also explain the slowdown if you were copying the integers, because you may have inadvertently been using integers that overlap in memory.
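The difference between the two kinds of offset can be made concrete by measuring how many bytes each one moves. This is an illustrative sketch only: the function names are hypothetical, and the int pointers are formed just to compute distances and are never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

// Byte distance from `base` when the char pointer is offset first,
// as in (int*)(buffer + n).
inline std::ptrdiff_t offset_as_chars(char* base, std::size_t n) {
    assert(base != NULL);
    int* p = reinterpret_cast<int*>(base + n);   // moves n bytes
    return reinterpret_cast<char*>(p) - base;
}

// Byte distance from `base` when the cast happens first,
// as in (int*)buffer + n.
inline std::ptrdiff_t offset_as_ints(char* base, std::size_t n) {
    assert(base != NULL);
    int* p = reinterpret_cast<int*>(base) + n;   // moves n * sizeof(int) bytes
    return reinterpret_cast<char*>(p) - base;
}
```

With a 4-byte int, offset_as_chars(buffer, 5) yields 5 while offset_as_ints(buffer, 5) yields 20, so only the second form keeps every element on an int boundary, provided buffer itself is aligned.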
If you must use stack allocation, and it does come up, you can also align that. I believe there is a Boost function that does this as well.
char cbuffer[1024+sizeof(int)];
int * ibuffer = (int*)(((uintptr_t)cbuffer + sizeof(int) - 1) / sizeof(int) * sizeof(int)); // uintptr_t comes from <stdint.h>
Then ibuffer will be integer aligned. The actual pointer value may not be the same as cbuffer, but in some cases it may be (depending on where the stack is at the time of the call). The second line does integer arithmetic on the address to round it up to a multiple of sizeof(int), which means it is int aligned. The extra sizeof(int) bytes in cbuffer leave room for that adjustment.
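The rounding step can be wrapped in a small reusable helper. This is a sketch under assumptions rather than a standard facility: the name align_up is made up, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical helper: rounds p up to the next multiple of `alignment`.
// Returns p unchanged if it is already aligned.
// `alignment` is assumed to be a power of two.
inline char* align_up(char* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uintptr_t addr = reinterpret_cast<uintptr_t>(p);
    uintptr_t aligned = (addr + alignment - 1) & ~(uintptr_t(alignment) - 1);
    return p + (aligned - addr);   // advances by 0 to alignment - 1 bytes
}
```

A caller would over-allocate by the alignment, as cbuffer does above, and then use (int*)align_up(cbuffer, sizeof(int)) as the start of the usable storage.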
new: Can somebody confirm whether new char[x] also guarantees alignment, as malloc does?
See Ulrich Drepper's excellent paper "What Every Programmer Should Know About Memory" for a complete explanation of this and other memory issues, together with example benchmarks you can run yourself.