I am trying to read chunks of data from a file directly into a struct but the padding is causing too much data to be read and the data to be misaligned.
Do I have to manually read each part into the struct or is there an easier way to do this?
My code:
The structs
typedef unsigned char byte;
struct Header
{
char ID[10];
int version;
};
struct Vertex //cannot rearrange the order of the members
{
byte flags;
float vertex[3];
char bone;
byte referenceCount;
};
How I am reading in the data:
std::ifstream in(path.c_str(), std::ifstream::in | std::ifstream::binary);
Header header;
in.read((char*)&header.ID, sizeof(header.ID));
header.ID[9] = '\0';
in.read((char*)&header.version, sizeof(header.version));
std::cout << header.ID << " " << header.version << "\n";
in.read((char*)&NumVertices, sizeof(NumVertices));
std::cout << NumVertices << "\n";
std::vector<Vertex> Vertices(NumVertices);
for(std::vector<Vertex>::iterator it = Vertices.begin(); it != Vertices.end(); ++it)
{
Vertex& v = (*it);
in.read((char*)&v.flags, sizeof(v.flags));
in.read((char*)&v.vertex, sizeof(v.vertex));
in.read((char*)&v.bone, sizeof(v.bone));
in.read((char*)&v.referenceCount, sizeof(v.referenceCount));
}
I tried doing: in.read((char*)&Vertices[0], sizeof(Vertices[0]) * NumVertices);
but this produces incorrect results because of what I believe to be the padding.
Also, at the moment I am using C-style casts. What would be the correct C++ cast to use in this scenario, or is a C-style cast okay?
If the file was written by dumping the entire structure in binary, you don't need to read each member separately. Just read sizeof(Header) bytes from the file straight into the struct you have defined. This only works when the writer used the same struct layout (same padding and packing settings) as the reader.
Header header;
in.read((char*)&header, sizeof(Header));
If you're always running on the same architecture or the same machine, you won't need to worry about endian issues, because the data is written out the same way your application reads it back in. If you create the file on one architecture and expect it to be usable on another, you will need to swap bytes accordingly. The way I have done this in the past is to declare a swap function of my own in a header (for example Swap.h) and provide one implementation per platform.
Swap.h - This is the header you use within your code
void swap(unsigned char *x, int size);
------------------
SwapIntel.cpp - This is what you would compile and link against when building for Intel
void swap(unsigned char *x, int size)
{
return; // Do nothing: the file was written on Intel (little-endian), so the bytes are already in the right order
}
------------------
SwapSolaris.cpp - This is what you would compile and link against when building for Solaris
void swap(unsigned char *x, int size)
{
// Byte swapping code here to switch from little-endian to big-endian as the file was written on Intel
// and this file will be the implementation used within the Solaris build of your product
return;
}
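One possible body for the byte-swapping version is sketched below. It keeps the swap signature from Swap.h and simply reverses the bytes of a single value in place. The helper swapped is a hypothetical name added only to show a call, and it assumes the value being converted is an unsigned int.

```cpp
#include <algorithm>  // std::reverse
#include <cassert>    // assert
#include <cstring>    // std::memcpy

// Reverses the bytes of one value in place, which converts it between
// little-endian and big-endian order.
void swap(unsigned char *x, int size)
{
    assert(x != 0 && size >= 0);  // precondition check for this sketch
    std::reverse(x, x + size);
}

// Hypothetical helper (not part of the answer): returns a copy of v
// with its bytes reversed.
unsigned int swapped(unsigned int v)
{
    unsigned char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);         // copy the value into a byte buffer
    swap(bytes, static_cast<int>(sizeof v));  // reverse the byte order
    std::memcpy(&v, bytes, sizeof v);         // copy the reversed bytes back
    return v;
}
```

Call swap once per multi-byte field (for example header.version and each float in Vertex::vertex), not once for the whole struct, since reversing the whole struct would also reorder the fields.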
No, you don't have to read each field separately. The effect you are seeing is caused by alignment/packing. See http://en.wikipedia.org/wiki/Data_structure_alignment
A C-style cast here behaves like reinterpret_cast, and in this case you are using it correctly. You may use the C++-specific syntax, reinterpret_cast<char*>(&header), but it is a lot more typing.
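If the extra typing is a concern, the cast can be hidden in a small helper. The sketch below is only an illustration: readRaw and roundTrip are hypothetical names, and the helper is meant for plain-old-data types such as int or float.

```cpp
#include <cassert>  // assert
#include <istream>  // std::istream
#include <sstream>  // std::istringstream
#include <string>   // std::string

// Reads sizeof(T) raw bytes from the stream into value.
// reinterpret_cast<char*> does the same job as the C-style (char*) cast.
template <typename T>
std::istream& readRaw(std::istream& in, T& value)
{
    return in.read(reinterpret_cast<char*>(&value),
                   static_cast<std::streamsize>(sizeof(T)));
}

// Usage sketch: round-trips an int through an in-memory stream.
int roundTrip(int value)
{
    std::string bytes(reinterpret_cast<const char*>(&value), sizeof value);
    std::istringstream in(bytes);
    int result = 0;
    readRaw(in, result);
    assert(!in.fail());  // all sizeof(int) bytes were available
    return result;
}
```

With this in place, a call such as readRaw(in, header.version) replaces in.read((char*)&header.version, sizeof(header.version)).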
You can change the padding by explicitly asking your compiler to align structs on 1 byte instead of 4 or whatever its default is. Depending on the environment, this can be done in several ways: file by file (per compilation unit), struct by struct (with pragmas and such), or only for the whole project.
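As a minimal sketch of the struct-by-struct approach, #pragma pack(push, 1) is accepted by MSVC, GCC, and Clang. PackedVertex, kVertexBytesOnDisk, and checkVertexLayout are hypothetical names, and the sketch assumes the file stores the fields back to back with no padding.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t

typedef unsigned char byte;

#pragma pack(push, 1)  // no padding between members from here on
struct PackedVertex    // hypothetical name; same members and order as Vertex
{
    byte flags;
    float vertex[3];
    char bone;
    byte referenceCount;
};
#pragma pack(pop)      // restore the previous packing setting

// Bytes one vertex occupies in the file if the fields are stored back to back.
const std::size_t kVertexBytesOnDisk =
    sizeof(byte) + 3 * sizeof(float) + sizeof(char) + sizeof(byte);

// Call once at startup: fails if the compiler still inserted padding.
void checkVertexLayout()
{
    assert(sizeof(PackedVertex) == kVertexBytesOnDisk);
}
```

Once the sizes match, a single read of sizeof(PackedVertex) * NumVertices bytes into the vector fills every vertex. Note that members of a packed struct may be misaligned, which can be slower or unsupported on some architectures.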
Note that header.ID[10] = '\0'; would write past the end of the array:
header.ID[9] is the last element of the array.
If you are using a Microsoft compiler then explore the pack pragma (#pragma pack). There are also the packing include files:
#include <pshpack1.h>
// your code here
#include <poppack.h>
GCC has a different system: attributes such as __attribute__((packed)) or __attribute__((aligned(n))) that you add to the structure definition to control alignment and padding.
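A short sketch of the GCC form, applied to the Vertex layout from the question, might look like this. GccPackedVertex, PACKED_STRUCT, and checkVertexOffsets are hypothetical names; the macro expands to nothing on compilers that do not define __GNUC__, so the struct is only packed under GCC and Clang.

```cpp
#include <cassert>  // assert
#include <cstddef>  // offsetof

typedef unsigned char byte;

#if defined(__GNUC__)  // GCC and Clang
#define PACKED_STRUCT __attribute__((packed))
#else
#define PACKED_STRUCT  // other compilers: no effect in this sketch
#endif

struct GccPackedVertex  // hypothetical name; same members and order as Vertex
{
    byte flags;
    float vertex[3];
    char bone;
    byte referenceCount;
} PACKED_STRUCT;

// With packing in effect, every member starts right after the previous one.
void checkVertexOffsets()
{
    assert(offsetof(GccPackedVertex, vertex) == sizeof(byte));
    assert(offsetof(GccPackedVertex, bone) == sizeof(byte) + 3 * sizeof(float));
    assert(offsetof(GccPackedVertex, referenceCount) ==
           sizeof(byte) + 3 * sizeof(float) + sizeof(char));
}
```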
If you are both reading and writing this file yourself, try the Google Protocol Buffers (protobuf) library. It handles byte order, alignment, padding, and language interop issues for you.
http://code.google.com/p/protobuf/