
What should x be in __attribute__((aligned(x)))?

Developer https://www.devze.com 2023-03-26 11:43 Source: Internet

I understand that variables are aligned for efficiency. What I do not understand is how to choose the alignment value. My understanding is that the alignment should always equal the processor's word size (i.e. 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine), regardless of the data type, so that processor reads line up with the address of the variable.

For example, why would someone do something like this? I realize it is just an exercise in some programming book. Does it make sense to use different alignment values like the ones in the link?


Basic rule: data types should be naturally aligned. The alignment should equal the number of bytes needed to store the type, rounded up to the next power of 2, e.g.:

type                        size   align (bytes)
char                           1       1
short                          2       2
int                            4       4
float                          4       4
int64_t                        8       8
double                         8       8
long double (x87, 80 bit)     10      16
__float128                    16      16
__int128                      16      16
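The rule of thumb in the table can be compared against what the compiler actually reports. The following is a minimal sketch, assuming GCC or Clang; the helper names `natural_alignment` and `is_power_of_two` are made up for illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if align is a power of two (every valid alignment is). */
static int is_power_of_two(size_t align) {
    return align != 0 && (align & (align - 1)) == 0;
}

/* Hypothetical helper: rounds a size up to the next power of two,
 * which is the rule of thumb used in the table above. */
static size_t natural_alignment(size_t size) {
    size_t align = 1;
    while (align < size) {
        align <<= 1;
    }
    return align;
}
```

Comparing `natural_alignment(sizeof(T))` with `_Alignof(T)` for a given type shows where a particular ABI deviates from the rule; some 32-bit ABIs, for instance, align 8-byte types to only 4 bytes.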

Some architectures, e.g. SPARC, fault on a multi-byte access that is not naturally aligned (a 4-byte load must sit on a 4-byte boundary), and compilers there may give even a single char its own 4-byte slot. Even on architectures that permit misaligned access, aligned data can be faster to access. For this reason, local variables on the stack and struct fields are often padded when you mix differently sized types, though that behavior can be altered if so desired.
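The padding described above can be observed with `offsetof`. This is a small sketch, assuming GCC or Clang for the `packed` attribute; the struct and function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler pads after tag so that value
 * starts on a boundary suitable for int32_t. */
struct padded {
    char    tag;
    int32_t value;
};

/* Same fields with padding disabled (GCC/Clang extension).
 * value may now be misaligned, which is slower or even
 * faulting on strict-alignment architectures. */
struct packed {
    char    tag;
    int32_t value;
} __attribute__((packed));

static size_t padded_value_offset(void) {
    return offsetof(struct padded, value);
}

static size_t packed_value_offset(void) {
    return offsetof(struct packed, value);
}
```

In the padded struct the offset of `value` equals the alignment of `int32_t`; in the packed one it is 1, directly after `tag`.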

Caches work best with alignment larger than the word size: not 32 or 64 bits, but the cache line size (e.g. 16, 32, or 64 bytes), so that an object does not straddle two cache lines.
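As an illustration of cache-line alignment, the sketch below gives each array element its own line. The value 64 for `CACHE_LINE` is an assumption (it is common on current x86 parts but not universal), and `struct counter` and `starts_on_cache_line` are hypothetical names.

```c
#include <stdint.h>

#define CACHE_LINE 64  /* assumed line size; check your target CPU */

/* Aligning the type also rounds its size up to a multiple of the
 * alignment, so consecutive array elements land on separate lines. */
struct counter {
    long value;
} __attribute__((aligned(CACHE_LINE)));

static struct counter counters[4];

/* Returns 1 if p sits exactly on a cache line boundary. */
static int starts_on_cache_line(const void *p) {
    return ((uintptr_t)p % CACHE_LINE) == 0;
}
```

A typical use is per-thread counters: with one counter per line, two threads updating neighbouring elements do not contend for the same cache line.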

Some wider data types and instructions, such as SSE2 (128 bits wide) or double-precision floats (64 bits wide), are faster when the data is aligned to its native width, and some instructions will fault otherwise. If you need to load 128-bit data, align it to 128 bits (16 bytes).
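For 128-bit data, a rough sketch using the GCC/Clang vector extension (rather than SSE intrinsics, so it is not tied to x86) looks like this. The names `v4sf`, `samples`, `add4`, and `is_aligned_16` are chosen for the example only.

```c
#include <stdint.h>

/* A 128-bit vector of four floats. GCC and Clang align such a
 * type to its own size, i.e. 16 bytes. */
typedef float v4sf __attribute__((vector_size(16)));

/* A plain float array needs the attribute to get the same
 * 16-byte alignment; by default it is only float-aligned. */
static float samples[4] __attribute__((aligned(16))) = {
    1.0f, 2.0f, 3.0f, 4.0f
};

/* Returns 1 if p is on a 16-byte boundary. */
static int is_aligned_16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Element-wise addition; the compiler may lower this to a single
 * 128-bit instruction where the target has one. */
static v4sf add4(v4sf a, v4sf b) {
    return a + b;
}
```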

DMA buffers and memory pages need even larger alignment, but that is usually obtained by pointer manipulation.
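The pointer manipulation mentioned above usually means over-allocating and rounding the address up. Here is a minimal sketch, assuming the alignment is a power of two; `align_up` and `alloc_aligned` are illustrative names, not standard functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounds p up to the next multiple of alignment.
 * alignment must be a power of two. */
static void *align_up(void *p, size_t alignment) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (void *)((addr + mask) & ~mask);
}

/* Over-allocates by alignment - 1 bytes and returns an aligned
 * pointer into the block. The original pointer is stored in *raw
 * and is the one that must be passed to free(). */
static void *alloc_aligned(size_t size, size_t alignment, void **raw) {
    *raw = malloc(size + alignment - 1);
    if (*raw == NULL) {
        return NULL;
    }
    return align_up(*raw, alignment);
}
```

C11 `aligned_alloc` and POSIX `posix_memalign` provide the same result without the manual bookkeeping, where they are available.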

OpenCL (GPGPU) sometimes needs very large alignment because of very wide DDR buses and the memory access limits of GPU cores: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/attributes-variables.html

/* a has alignment of 128 */
 __attribute__((aligned(128))) struct A {int i;} a;