Questions on usages of sizeof

Question 1

I have a struct like,

struct foo
{
    int a;
    char c;
};

When I say sizeof(foo), I am getting 8 on my machine. As per my understanding, 4 bytes for int, 1 byte for char and 3 bytes for padding. Is that correct? Given a struct like the above, how will I find out how many bytes will be added as padding?

Question 2

I am aware that sizeof can be used to calculate the size of an array. Mostly I have seen the usage like (foos is an array of foo)

sizeof(foos)/sizeof(*foos)

But I found that the following will also give same result.

sizeof(foos) / sizeof(foo)

Is there any difference in these two? Which one is preferred?

Question 3

Consider the following statement.

foo foos[] = {10,20,30};

When I do sizeof(foos) / sizeof(*foos), it gives 2. But the array has 3 elements. If I change the statement to

foo foos[] = {{10},{20},{30}};

it gives correct result 3. Why is this happening?

Any thoughts..

Answer 1

Yes - your calculation is correct. On your machine, sizeof(int) == 4, and int must be 4-byte aligned.

You can find out about the padding by manually adding the sizes of the base elements and subtracting that from the size reported by sizeof(). You can predict the padding if you know the alignment requirements on your machine. Note that some machines are quite fussy and give SIGBUS errors when you access misaligned data; others are more lax but slow you down when you access misaligned data (and they might support '#pragma pack' or something similar). Often, a basic type has a size that is a power of 2 (1, 2, 4, 8, 16) and an n-byte type like that must be n-byte aligned. Also, remember that structures have to be padded so that an array of structures will leave all elements properly aligned. That means the structure will normally be padded up to a multiple of the alignment of the most stringently aligned member in the structure.
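The subtraction described above can be sketched in C roughly like this. The helper names foo_padding and foo_alignment are made up for illustration, and _Alignof assumes a C11 compiler.

```c
#include <stddef.h>

/* Same layout as the struct in the question. */
struct foo {
    int a;
    char c;
};

/* Total padding: size of the whole struct minus the sum of the member sizes.
   (Hypothetical helper name, not from the original post.) */
size_t foo_padding(void)
{
    return sizeof(struct foo) - (sizeof(int) + sizeof(char));
}

/* Alignment requirement of the struct (C11 _Alignof).
   sizeof(struct foo) is always a multiple of this value, which is
   what keeps every element of an array of foo properly aligned. */
size_t foo_alignment(void)
{
    return _Alignof(struct foo);
}
```

With a 4-byte int that must be 4-byte aligned, as on the asker's machine, foo_padding() comes to 8 - 5 = 3. On other platforms the numbers may differ.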

Answer 2

Generally, a variant on the first is better; it remains correct when you change the base type of the array from a 'foo' to a 'foobar'. The macro I customarily use is:

#define DIM(x) (sizeof(x)/sizeof(*(x)))

Other people have other names for the same basic operation - and you can put the name I use down to pollution from the dim and distant past and some use of BASIC.

As usual, there are caveats. Most notably, you can't apply this meaningfully to array arguments to a function or to a dynamically allocated array (using malloc() et al or new[]); you have to apply it to the actual definition of an array. Normally the value is a compile-time constant. Under C99, it could be evaluated at runtime if the array is a VLA - variable-length array.
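A small sketch of that caveat, assuming a typedef so that plain 'foo' works in C. The function names count_at_definition and size_seen_by_callee are invented for the example.

```c
#include <stddef.h>

/* Element count of a true array; the same macro as above. */
#define DIM(x) (sizeof(x) / sizeof(*(x)))

/* typedef so that plain 'foo' can be used in C, as in the question. */
typedef struct foo {
    int a;
    char c;
} foo;

/* Applied where the array is defined, DIM gives the element count. */
size_t count_at_definition(void)
{
    foo foos[5] = {{0, 0}};
    return DIM(foos);               /* 5 */
}

/* A parameter declared as 'foo arr[]' is adjusted to 'foo *arr',
   so inside the callee sizeof only sees a pointer, not the array. */
size_t size_seen_by_callee(const foo *arr)
{
    return sizeof(arr);             /* size of a pointer, whatever the caller passed */
}
```

So a function that needs the length has to receive it as a separate argument; dividing sizeof(arr) by sizeof(*arr) in the callee would not give the element count.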

Answer 3

Because of the way initialization works when you don't have enough braces. Your 'foo' structure has two members. The 10 and the 20 go to the first array element; the 30 and an implicit 0 go to the second. Hence the size is two. When you supply the sub-braces, there are 3 elements in the array, the first members of which have the values 10, 20, 30, and the second members of which are all zero.
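Here is a minimal sketch of the two initializers side by side, again assuming a typedef for 'foo'. The array names flat and braced are just for illustration. gcc and clang can flag the first form if you enable -Wmissing-braces.

```c
#include <stddef.h>

#define DIM(x) (sizeof(x) / sizeof(*(x)))

typedef struct foo {
    int a;
    char c;
} foo;

/* No inner braces: the values fill members in order, so this is
   {10, 20} followed by {30, 0}, i.e. two elements. */
foo flat[] = {10, 20, 30};

/* One brace pair per element: three elements, each with c set to 0. */
foo braced[] = {{10}, {20}, {30}};
```

DIM(flat) is 2 and DIM(braced) is 3, which matches what the question observed.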

The padding is usually related to the size of the registers on the host CPU - in your case, you've presumably got a 32-bit CPU, so the "natural" size of an int is 4 bytes. It is slower and more difficult for the CPU to access quantities of memory smaller than this size, so it is generally preferable to align values onto 4-byte boundaries. The struct thus comes out as a multiple of 4 bytes in size. Most compilers will allow you to modify the amount of padding used (e.g. with "#pragma"s), but this should only be used where the memory footprint of the struct is absolutely critical.

"*foos" references the first entry in the foos array. "foo" references (a single instance of) the type. So they are essentially the same. I would use sizeof(type) or sizeof(array[0]) myself, as *array is easier to misread.

In your first example, you are not initialising the array entries correctly. Your struct has 2 members, so you should use { a, b } to initialise each element of the array. So you need the form { {a, b}, {a, b}, {a, b} } to correctly initialise the entries.

To find out how much padding you have, simply add up the sizeof() of each element of the structure, and subtract this sum from the sizeof() of the whole structure.

You can use offsetof() to find out exactly where the padding is, in more complex structs. This may help you to fill holes by rearranging elements, reducing the size of the struct as a whole.
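As a rough sketch of the offsetof() approach: the structs loose and tight and the helper hole_after_tag are hypothetical, chosen only to show a hole and one way of closing it.

```c
#include <stddef.h>

/* Hypothetical struct: on typical ABIs there is a hole between
   'tag' and 'value', plus padding after 'flag'. */
struct loose {
    char tag;
    double value;
    char flag;
};

/* Same members, largest first: the two chars now sit together. */
struct tight {
    double value;
    char tag;
    char flag;
};

/* Size of the gap between the end of 'tag' and the start of 'value'. */
size_t hole_after_tag(void)
{
    return offsetof(struct loose, value)
         - (offsetof(struct loose, tag) + sizeof(char));
}
```

On a typical x86-64 ABI, struct loose is 24 bytes with a 7-byte hole after tag, while struct tight is 16 bytes. Other platforms may give different numbers, so it is worth printing the offsets rather than assuming them.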

It is good practice to explicitly align structure elements, by manually inserting padding elements so that every element is guaranteed to be "naturally aligned". You can reuse these padding elements for useful data in the future. If you ever write a library that will require a stable ABI, this will be a required technique.
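One way this could look for the struct from the question, assuming a 4-byte int. The struct name foo_padded and the member name reserved are made up, and _Static_assert needs C11.

```c
#include <stddef.h>

/* The struct from the question with its padding spelled out.
   'reserved' is a hypothetical name; 3 bytes assumes a 4-byte int. */
struct foo_padded {
    int a;
    char c;
    char reserved[3];   /* explicit padding, available for later use */
};

/* Fails to compile if the compiler still adds padding after 'reserved',
   i.e. if the 4-byte int assumption does not hold on this platform. */
_Static_assert(offsetof(struct foo_padded, reserved) + 3
                   == sizeof(struct foo_padded),
               "unexpected implicit padding in struct foo_padded");
```

The compile-time check is optional, but it catches the case where the layout assumption is wrong before the struct ends up in a file format or a library interface.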