In the "Advanced Programming in the Unix Environment" book there's a part (ch 8.14, page 251) in which the author shows us the definition of the "acct" struct (used to store accounting records info). He then shows a program in which he reads the accounting data from a file into the struct (the key part of which is):
fread (&acdata, sizeof(acdata), 1, fp)
The trouble I'm having is that I've heard that C compilers will sometimes rearrange the elements of a struct in memory in order to better utilize space (due to alignment issues). So, if this code is just taking all of the content of the file and sticking it into acdata (and the contents of the file are arranged to match the ordering specified in the struct definition) if some of the elements of struct have been moved, then if I refer to them in code, I may not be getting what I expected (since the data in the file did not get rearranged the way the struct did in memory).
What am I missing (because from what I'm getting this doesn't seem reliable)?
Thanks for your help (my apologies if I've done something wrong procedurally - this is my first time posting)
Worry!
You are right to worry about this issue and pay attention to it. It's a vexing problem, and it often happens when you carry your source to another machine with a different (even slightly different) architecture, and perhaps with a different OS or a different compiler; compile your program there; and expect your structs to remain intact over fwrite() and fread(). Or when you add a 1-byte variable to your struct, recompile, and send out binaries to all your friends. Your program doesn't work on their machines anymore, for some mysterious reason.
Sometimes it works (by accident) and you never notice the problem; sometimes it doesn't work and you pull your hair out for a few days.
The issue has nothing to do with rearrangement of struct members. Compilers don't do that. It has nothing to do with optimization, either.
The issue is byte alignment, and the Wikipedia article mentioned below tells you how to fix up your structs so they'll always be correctly aligned. It's always a good idea to pay attention to byte alignment. Otherwise your program isn't portable. And, worse, the program you carefully compiled on your whiz-bang x86-64 and distributed to all of your customers all of a sudden won't run on their 32-bit machines.
Just as important: be mindful of the lengths and alignments of the struct members, too.
There's a nice Wikipedia article that explains the details. It's a very worthwhile read.
I would be wary of a compiler-specific pragma that does the job, but just for that compiler. If you put such a pragma in your code, then your program isn't portable, standard C anymore.
The layout (padding and alignment, but not order) of the structure may change if you compile your code on a different compiler, or a later version of the compiler, or even with different compile-time options.
It won't change from run to run of the same compiled program - that would be a nightmare scenario :-)
So, provided the same program (or technically, any program which has the same structure layout encoded into it at compile time) is the one doing the reading, this will work just fine.
The relevant sections of the C99 standard are:
6.2.6.1/1: The representations of all types are unspecified except as stated in this subclause.
6.2.6.1/6 (the only mention of structures in that subclause): When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.
That's the only mention of structure padding in that subclause. In other words, it's up to the implementation and they don't even need to document it (unspecified as opposed to implementation-defined, which would require documenting).
6.7.2.1/13: ... There may be unnamed padding within a structure object, but not at its beginning.
6.7.2.1/15: There may be unnamed padding at the end of a structure or union.
If you were to create version 1.1 of your program and it uses a different structure layout (new compiler, different compiler options, #pragma pack, etc), it would very quickly be evident that you had a problem during your unit tests (which should include loading in a file from the previous version).
In that case, you could include some 'intelligence' in your 1.1 program which could recognise an earlier file layout and transform the data as it comes in. That's why good file formats will often have a version indicator (for the file layout version, not the program version) as the first item in that file.
For example, quite a few of my applications use an application identifier along with a 16-bit integer at the front of the file to indicate what application and version it is and the file loader part of the program can handle at least the current and previous versions (and often every version ever created).
The program version and file layout version are separate things - they can drift if, for example, you release ten versions of your program without needing to update the file layout.
Yes
Your program will be stable.
Your question has touched off a bonfire of portability recommendations that you didn't actually ask for. The question you seemed to be asking is "is this code pattern and my program stable?". And the answer to that is yes.
Your structure will not be reordered. C99 specifically prohibits rearranging the structure members. [1]
Also, the layout and alignment do not depend on optimization level. If they did, all programs would have to be entirely built with the same optimization level, as well as all library routines, the kernel, all kernel interfaces, etc.
Users would also have to track, forever, the optimization level of every one of those interfaces listed above that ever had been compiled as part of the system.
The memory alignment rules are really a kind of hidden ABI. They can't change without adding very specialized and by definition rarely-used compiler flags. They tend to work just fine over different compilers. (Otherwise, every element of a system identified above would ALSO have to be compiled by the same compiler, or be useless. Every compiler that supports a given system uses the exact same alignment rules. Nothing would work, otherwise.) The compiler flags that change alignment policies are usually intended to be built into the compiler configuration for a given OS.
Now, your binary file layout, while perfectly reasonable, is a bit old-school. It has certain drawbacks. While none of these are show-stoppers and none are generally worth rewriting an app, they include:
- it's hard to debug binary files
- they do lock in a single byte order and a single alignment policy. In the (sadly, increasingly unlikely) case where you need to port to a new architecture, you might end up needing to unpack the record with memcpy(3). Not the end of the world.
- they aren't structured. Things like YAML and, ahem, even XML are sort of self-parsing, so it becomes a lot easier to read in a file, and certain types of file manipulations can be done with tools. Even more important, the file format itself becomes more flexible. Your ability to take advantage of the auto-parsed-object is limited, however, in C and C++.
As I understand Paxdiablo's request, he would like me to agree that there exist compiler options and pragmas that, if used, will alter the alignment rules. That's true. Obviously these options are used only for specific reasons.
1. C99 6.7.2.1(13) Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.
The struct is written to the file based on how it is in memory. The ordering will be the same. Mixing compilers between write and read might be an issue however.