C++: figuring out memory layout of members programmatically

Suppose in one program, I'm given:

#include <cstdio>

class Foo {
public:
  int x;
  double y;
  char z;
};

class Bar {
public:
  Foo f1;
  int t;
  Foo f2;
};

int main() {
  Bar b;
  b.f1.z = 'h';
  b.f2.z = 'w';
  // ... some code setting the other values of b
  FILE *f = fopen("dump", "wb"); // C-style file
  fwrite(&b, sizeof(Bar), 1, f);
  fclose(f);
}

Suppose in another program, I have:

int main() {
  FILE *f = fopen("dump", "rb");
  std::string Foo = "int x; double y; char z;";
  std::string Bar = "Foo f1; int t; Foo f2;";

  // now, given this is it possible to read out
  // the value of bar.f1.z and bar.f2.z set earlier?
}

What I'm asking is: given the member types of a class, can I figure out how C++ lays it out in memory?

You need to research "serialization". One library that is often recommended for this is Boost.Serialization.

FWIW, I recommend against using fwrite or std::ostream::write on whole classes, structures and unions. The compiler is allowed to insert padding between members, so garbage bytes may be written out. Also, pointers don't serialize well: the address they hold means nothing to the program that reads the file.

To answer your question: in order to determine which structure to load the data into, you need some kind of sentinel in the file that indicates the object type. This can be anything from an enum value to the name of the type.

Also investigate the Factory design pattern.

I'm not quite sure what you're asking, so I'll take a leap...

If you really need to figure out where the fields are in a struct, use offsetof.

Note the "POD" restriction on offsetof (since C++11 the requirement is that the type be standard-layout). offsetof is a C macro, included in C++ for compatibility reasons. We are supposed to use member pointers instead these days, though member pointers don't address all the same problems.

"offsetof" basically imagines an instance of your struct at address zero, and then looks at the address of the field you're interested in. This goes horribly wrong if your struct/class uses virtual inheritance, since finding a field of a virtual base then involves (typically) a lookup through the virtual table. Since the imaginary instance at address zero doesn't exist, it doesn't have a virtual table pointer, so you probably get some kind of access violation crash. Plain multiple inheritance usually keeps fixed offsets, but such types are still outside what offsetof is guaranteed to support.

Some compilers can cope with this, as they have replaced the traditional offsetof macro with an intrinsic that knows the layout of the struct without trying to do the imaginary-instance trickery. Even so, it's best not to rely on this.

For POD structs, though, offsetof is a convenient way to find the offset to a particular field, and a safe one in that it determines the actual offset irrespective of the alignment applied by your platform.

For the size of a field, you obviously just use sizeof. That just leaves platform-specific issues: different layouts on different platforms due to alignment, endianness and so on ;-)

EDIT

Possibly a silly question, but why not fread the data from the file straight into an instance of the struct, doing essentially what you did with the fwrite but in reverse?

You get the same portability issues as above, meaning your code may not be able to read its own files if recompiled using different options, a different compiler or for a different platform. But for a single-platform app this kind of thing works very well.

You can't assume anything about the order or layout of the bytes that represent Bar. If the file moves between systems, or the program is compiled with different flags, then you'll be reading and writing different layouts.

I've seen a way around this, but it may only work for very simple types.

I quote from a RakNet tutorial:

#pragma pack(push, 1)
struct structName
{
  unsigned char typeId; // Your type here
  // Your data here
};
#pragma pack(pop)

Notice the #pragma pack(push, 1) and #pragma pack(pop)? These force your compiler (in this case VC++) to pack the structure as byte-aligned. Check your compiler documentation to learn more.

You want serialization.

For the example that you give, it looks like you really need some sort of C parser that would parse the strings with your type declarations. Then you'd be able to interpret the bytes that you read from the file in the correct way.

Structs in C are laid out member by member in order of declaration. The compiler may insert padding between members according to platform-specific alignment needs. The size of the variables is also platform-specific.

If you have control over the class you can use member pointers. You definitely can do this. The question is whether or not you should...

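#include <iostream>

// Type-erased interface: reports the byte offset of one member.
class Metadata
{
public:
    virtual ~Metadata() {}
    virtual int getOffset() = 0;
};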

template <typename THost, typename TField>
class TypedMetadata : public Metadata
{
private:
    TField (THost::*memberPointer_);

    TypedMetadata(TField (THost::*memberPointer))
    {
        memberPointer_ = memberPointer;
    }

public:
    static Metadata* getInstance(TField (THost::*memberPointer))
    {
        return new TypedMetadata<THost, TField>(memberPointer);
    }

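    virtual int getOffset()
    {
        // Imaginary instance at address zero, like the traditional offsetof macro.
        // Formally undefined behavior; common compilers accept it for simple classes.
        THost* host = 0;

        // Subtract as char* to get a byte offset; a plain (int) cast of the
        // pointer does not compile on most 64-bit targets.
        char* field = reinterpret_cast<char*>(&(host->*memberPointer_));

        return static_cast<int>(field - reinterpret_cast<char*>(host));
    }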
};

template<typename THost, typename TField>
Metadata* getTypeMetadata(TField (THost::*memberPointer))
{
    return TypedMetadata<THost, TField>::getInstance(memberPointer);
}

class Contained
{
    char foo[47];
};

class Container
{
private:
    int x;
    int y;
    Contained contained;
    char c1;
    char* z;
    char c2;

public:
    static Metadata** getMetadata()
    {
        Metadata** metadata = new Metadata*[6];

        metadata[0] = getTypeMetadata(&Container::x);
        metadata[1] = getTypeMetadata(&Container::y);
        metadata[2] = getTypeMetadata(&Container::contained);
        metadata[3] = getTypeMetadata(&Container::c1);
        metadata[4] = getTypeMetadata(&Container::z);
        metadata[5] = getTypeMetadata(&Container::c2);

        return metadata;
    }
};

int main() // the original used the C++/CLI signature main(array<System::String ^> ^args)
{
    Metadata** metadata = Container::getMetadata();

    std::cout << metadata[0]->getOffset() << std::endl;
    std::cout << metadata[1]->getOffset() << std::endl;
    std::cout << metadata[2]->getOffset() << std::endl;
    std::cout << metadata[3]->getOffset() << std::endl;
    std::cout << metadata[4]->getOffset() << std::endl;
    std::cout << metadata[5]->getOffset() << std::endl;

    return 0;
}