I'm using fwrite() and fread() for the first time to write some data structures to disk and I have a couple of questions about best practices and proper ways of doing things.
What I'm writing to disk (so I can later read it back) is all user profiles inserted in a Graph structure. Each graph vertex is of the following type:
typedef struct sUserProfile {
    char name[NAME_SZ];
    char address[ADDRESS_SZ];
    int socialNumber;
    char password[PASSWORD_SZ];
    HashTable *mailbox;
    short msgCount;
} UserProfile;
And this is how I'm currently writing all the profiles to disk:
void ioWriteNetworkState(SocialNetwork *social) {
    Vertex *currPtr = social->usersNetwork->vertices;
    UserProfile *user;
    /* "wb": open in binary mode, since raw bytes are written */
    FILE *fp = fopen("save/profiles.dat", "wb");
    if(!fp) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    /* Header: total number of users in the graph */
    fwrite(&(social->usersCount), sizeof(int), 1, fp);
    while(currPtr) {
        user = (UserProfile*)currPtr->value;
        fwrite(&(user->socialNumber), sizeof(int), 1, fp);
        /* Strings are written without a length or '\0' terminator */
        fwrite(user->name, sizeof(char)*strlen(user->name), 1, fp);
        fwrite(user->address, sizeof(char)*strlen(user->address), 1, fp);
        fwrite(user->password, sizeof(char)*strlen(user->password), 1, fp);
        fwrite(&(user->msgCount), sizeof(short), 1, fp);
        break; /* testing only: stop after the first user */
        currPtr = currPtr->next;
    }
    fclose(fp);
}
Notes:
- The first fwrite() you see writes the total user count in the graph so I know how much data I need to read back.
- The break is there for testing purposes. There are thousands of users and I'm still experimenting with the code.
My questions:
- After reading this I decided to use fwrite() on each element instead of writing the whole structure. I also avoid writing the pointer to the mailbox as I don't need to save that pointer. So, is this the way to go? Multiple fwrite() calls instead of a single one for the whole structure? Isn't that slower?
- How do I read back this content? I know I have to use fread() but I don't know the size of the strings, because I used strlen() to write them. I could write the output of strlen() before writing the string, but is there any better way without extra writes?
If your program needs to be at all portable then you should not be writing ints and shorts to disk as blocks of memory: the data will be corrupted when you try to read them in on a computer with a different word size (e.g. 32bit -> 64 bit) or different byte order.
For strings, you can either write the length first, or include a terminator at the end.
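A rough sketch of the length-first option, assuming every string is shorter than 256 bytes so a single length byte is enough. The helper names write_str and read_str are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes one length byte, then the string bytes
   (no terminator). Assumes strlen(s) <= 255. Returns 0 on success. */
int write_str(FILE *fp, const char *s) {
    size_t len = strlen(s);
    if (len > 255) return -1;
    if (fputc((int)len, fp) == EOF) return -1;
    if (len > 0 && fwrite(s, 1, len, fp) != len) return -1;
    return 0;
}

/* Hypothetical helper: reads a string written by write_str() into buf
   (bufsz bytes) and adds the '\0'. Returns 0 on success, -1 on error. */
int read_str(FILE *fp, char *buf, size_t bufsz) {
    int len = fgetc(fp);
    if (len == EOF || (size_t)len >= bufsz) return -1;
    if (len > 0 && fread(buf, 1, (size_t)len, fp) != (size_t)len) return -1;
    buf[len] = '\0';
    return 0;
}
```

The reader knows how many bytes to expect before it reads them, so it can reject a string that would not fit in the destination buffer.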
The best way is usually to use a text based format. For example, you could write each record as a separate line, with fields separated by a tab or a colon. (As a bonus, you no longer need to write a count of the number of records at the start of the file --- just read in records until you hit end of file.)
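A minimal sketch of the one-record-per-line idea, assuming tab-separated fields that are non-empty and contain no tabs or newlines. The Record type, the buffer sizes, and the names write_record and read_record are hypothetical, not taken from the question:

```c
#include <stdio.h>
#include <string.h>

/* Assumed sizes; the question does not show NAME_SZ or ADDRESS_SZ. */
#define NAME_SZ 32
#define ADDRESS_SZ 64

/* Hypothetical cut-down version of UserProfile. */
typedef struct {
    char name[NAME_SZ];
    char address[ADDRESS_SZ];
    int socialNumber;
    short msgCount;
} Record;

/* Writes one record as a tab-separated line. Returns 0 on success,
   -1 if a field contains a tab or the write fails. */
int write_record(FILE *fp, const Record *r) {
    if (strchr(r->name, '\t') || strchr(r->address, '\t')) return -1;
    return fprintf(fp, "%d\t%s\t%s\t%hd\n", r->socialNumber,
                   r->name, r->address, r->msgCount) < 0 ? -1 : 0;
}

/* Reads one line back. Returns 1 on success, 0 at end of file,
   -1 on a malformed line. */
int read_record(FILE *fp, Record *r) {
    char line[256];
    if (!fgets(line, sizeof line, fp)) return 0;
    /* Field widths are the buffer sizes minus one for the '\0'. */
    int n = sscanf(line, "%d\t%31[^\t]\t%63[^\t]\t%hd",
                   &r->socialNumber, r->name, r->address, &r->msgCount);
    return n == 4 ? 1 : -1;
}
```

Calling read_record() in a loop until it returns 0 reads the whole file without a record count at the start.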
Edit: But if this is a class assignment you've been given, you probably don't need to worry about portability. Write the '\0' terminators from the strings to disk to delimit them. Don't worry about efficiency when reading it back in; the slowest bit is the disk access.
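The terminator approach might look roughly like this. The helper names write_cstr and read_cstr are invented for the example, and the reader goes byte by byte with fgetc() because it cannot know the length in advance:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the string including its '\0'. */
int write_cstr(FILE *fp, const char *s) {
    size_t n = strlen(s) + 1;   /* +1 so the terminator is written too */
    return fwrite(s, 1, n, fp) == n ? 0 : -1;
}

/* Hypothetical helper: reads bytes up to and including the next '\0'
   into buf (bufsz bytes). Returns 0 on success, -1 if end of file is
   reached first or the string does not fit. */
int read_cstr(FILE *fp, char *buf, size_t bufsz) {
    size_t i = 0;
    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (i >= bufsz) return -1;
        buf[i++] = (char)c;
        if (c == '\0') return 0;
    }
    return -1;
}
```

Because stdio buffers the file, the per-byte fgetc() calls do not translate into one disk access each.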
Or even fwrite() out the entire structure and fread() it all back in. Worried about that pointer? Overwrite it with a safe value when you read it in. Don't worry about the wasted space on disk (unless you've been asked to minimise disk usage).
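As a sketch of the whole-structure approach: the array sizes below are placeholders (the question does not show them), HashTable is left as an opaque type, and write_profile and read_profile are hypothetical names. The resulting file is only readable by a program built with the same struct layout:

```c
#include <stdio.h>

/* Placeholder sizes; the real values are not shown in the question. */
#define NAME_SZ 32
#define ADDRESS_SZ 64
#define PASSWORD_SZ 32

typedef struct HashTable HashTable;   /* opaque: only the pointer is stored */

typedef struct sUserProfile {
    char name[NAME_SZ];
    char address[ADDRESS_SZ];
    int socialNumber;
    char password[PASSWORD_SZ];
    HashTable *mailbox;
    short msgCount;
} UserProfile;

/* Writes the struct as one block of memory, padding and pointer included. */
int write_profile(FILE *fp, const UserProfile *u) {
    return fwrite(u, sizeof *u, 1, fp) == 1 ? 0 : -1;
}

/* Reads one struct back and replaces the stale pointer with NULL. */
int read_profile(FILE *fp, UserProfile *u) {
    if (fread(u, sizeof *u, 1, fp) != 1) return -1;
    u->mailbox = NULL;   /* the saved address means nothing in a new process */
    return 0;
}
```

Zeroing each UserProfile before filling it in (for example with memset or calloc) keeps uninitialised bytes out of the file.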
If you do need to write non-negative ints to disk in a portable binary format, you could do it like this:
- first byte is the number of following bytes
- second byte is the most significant non-zero byte in the int
- ...
- last byte is the least significant byte in the int
So:
- 0 encodes as 00
- 1 encodes as 01 01
- 2 encodes as 01 02
- 255 -> 01 ff
- 256 -> 02 01 00
- 65535 -> 02 ff ff
- 65536 -> 03 01 00 00
- etc
If you need to encode negative numbers as well, you will need to reserve a bit for the sign somewhere.
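The encoding described above could be sketched as follows, assuming 8-bit bytes and values that fit in an unsigned long. The function names write_uint and read_uint are made up for the example:

```c
#include <stdio.h>

/* Writes v as one count byte followed by that many value bytes,
   most significant first. Zero is written as a lone count byte of 0. */
int write_uint(FILE *fp, unsigned long v) {
    unsigned char bytes[sizeof v];
    int n = 0;
    while (v != 0) {                 /* collect bytes, least significant first */
        bytes[n++] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    if (fputc(n, fp) == EOF) return -1;
    while (n > 0) {                  /* emit them most significant first */
        if (fputc(bytes[--n], fp) == EOF) return -1;
    }
    return 0;
}

/* Reads a value written by write_uint(). Returns 0 on success, -1 on
   end of file or if the value is too wide for unsigned long. */
int read_uint(FILE *fp, unsigned long *out) {
    int n = fgetc(fp);
    if (n == EOF || (size_t)n > sizeof *out) return -1;
    unsigned long v = 0;
    while (n-- > 0) {
        int c = fgetc(fp);
        if (c == EOF) return -1;
        v = (v << 8) | (unsigned long)c;
    }
    *out = v;
    return 0;
}
```

Since the bytes are produced by shifting rather than by copying memory, the file contents do not depend on the byte order or word size of the machine that wrote them.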
You're right: as you're doing it now, there's no way to read back the content because you can't tell where one string ends and the next begins.
The advice you cite to avoid using fwrite() for structured data is good, but interpreting that advice to mean that you should fwrite() each element individually may not be the best solution.
I think you should consider using a different format for your file, instead of writing raw values with fwrite(). (For example, your files will not be portable to a machine with different byte order.)
Since it looks like most of your elements are strings & integers, have you considered a text-based format using fprintf() to write and fscanf() to read? One big advantage of a text-based format instead of an application-specific binary format is that you can view it with standard tools (for debugging, etc.)
Also, whatever format you choose, make sure you consider the possibility that you may need to add more fields in the future. At a minimum, that means you should include a version number in some kind of header, either for the file itself or for each individual entry. Even better, tag the individual fields (to allow for optional attributes), for example:
    name: user1
    address: 1600 pennsylvania ave
    favorite color: blue

    name: user2
    address: 1 infinite loop
    last login: 12th of never
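One way a reader for such tagged lines might be sketched, assuming each field sits on its own line as "key: value". The Profile type, its sizes, and the names split_field and apply_field are hypothetical. Unknown keys are skipped, which is what lets older programs load newer files:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory record with only the fields this reader knows. */
typedef struct {
    char name[32];
    char address[64];
} Profile;

/* Splits "key: value" in place. Returns 0 and sets *key and *value,
   or -1 if the line has no ": " separator. */
int split_field(char *line, char **key, char **value) {
    line[strcspn(line, "\r\n")] = '\0';   /* drop the trailing newline */
    char *sep = strstr(line, ": ");
    if (!sep) return -1;
    *sep = '\0';
    *key = line;
    *value = sep + 2;
    return 0;
}

/* Stores a recognised field in p. Returns 1 if the key was known,
   0 if it was ignored. Over-long values are truncated by snprintf. */
int apply_field(Profile *p, const char *key, const char *value) {
    if (strcmp(key, "name") == 0) {
        snprintf(p->name, sizeof p->name, "%s", value);
        return 1;
    }
    if (strcmp(key, "address") == 0) {
        snprintf(p->address, sizeof p->address, "%s", value);
        return 1;
    }
    return 0;   /* e.g. "favorite color": not understood, so skipped */
}
```

A loop around fgets() would pass each line to split_field() and then apply_field(). How one record is told apart from the next (a blank line, or a new "name" key) is a separate design choice.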
It is slower. Calling a function x times is slower than calling it once where x>1. If performance turns out to be a concern, you can use fwrite/fread with sizeof(structure) for regular use and write a portable serialized version to import/export. But check if it really is a problem first. Most formats don't use binary data anymore, so you can tell that fread performance, at least, is not their main concern.

No, there isn't. The alternative is doing an fgetc(3)-based strlen.