How do you 'de-serialize' a derived class from serialized data?

How do you 'de-serialize' a derived class from serialized data? Or maybe I should say, is there a better way to 'de-serialize' data into derived classes?

For example, suppose you had a pure virtual base class (B) that is inherited by three other classes, X, Y and Z. Moreover, we have a method, serialize(), that will translate X:B, Y:B and Z:B into serialized data.

This way it can be zapped across a socket, a named pipe, etc. to a remote process.

The problem I have is, how do we create an appropriate object from the serialized data?

The only solution I can come up with is including an identifier in the serialized data that indicates the final derived object type. The receiver first parses the derived type field from the serialized data, and then uses a switch statement (or some similar logic) to invoke the appropriate constructor.

For example:

B deserialize( serial_data )
{
    parse the derived type from the serial_data

    switch (derived type)
        case X
            return X(serial_data)
        case Y
            return Y(serial_data)
        case Z
            return Z(serial_data)
}

So after learning the derived object type we invoke the appropriate derived type constructor.
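A minimal C++ sketch of that idea, assuming a made-up wire format in which the first byte is the type tag and the rest is the payload. The Tag values, the payload member and the name() method are hypothetical and only there to make the example compile; the objects are returned through a std::unique_ptr<B> because an abstract B cannot be returned by value.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical hierarchy: B is the abstract base, X, Y and Z derive from it.
struct B {
    virtual ~B() {}
    virtual std::string name() const = 0;
};
struct X : B {
    explicit X(std::string const& p) : payload(p) {}
    std::string name() const override { return "X"; }
    std::string payload;
};
struct Y : B {
    explicit Y(std::string const& p) : payload(p) {}
    std::string name() const override { return "Y"; }
    std::string payload;
};
struct Z : B {
    explicit Z(std::string const& p) : payload(p) {}
    std::string name() const override { return "Z"; }
    std::string payload;
};

// Assumed wire format: one tag byte, followed by the payload bytes.
enum class Tag : std::uint8_t { X = 1, Y = 2, Z = 3 };

std::unique_ptr<B> deserialize(std::string const& serial_data) {
    if (serial_data.empty()) throw std::invalid_argument("empty data");
    std::string const payload = serial_data.substr(1);
    // Parse the derived type, then invoke the matching constructor.
    switch (static_cast<Tag>(static_cast<std::uint8_t>(serial_data[0]))) {
        case Tag::X: return std::unique_ptr<B>(new X(payload));
        case Tag::Y: return std::unique_ptr<B>(new Y(payload));
        case Tag::Z: return std::unique_ptr<B>(new Z(payload));
        default: break;
    }
    throw std::invalid_argument("unknown type tag");
}
```

Every new derived class needs a new tag and a new case here, which is the part that feels cumbersome.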

However, this feels awkward and cumbersome. I'm hoping there is a more elegant way of doing this. Is there?

In fact, this is a more general issue than serialization; it is known as the Virtual Constructor problem.

The traditional approach is to use a Factory which, based on an ID, returns an object of the right derived type. There are two solutions:

the switch method, as you noticed, though you need to allocate on the heap and return a pointer, since a B returned by value cannot hold a derived object
the prototype method

The prototype method goes like so:

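#include <map>
#include <string>

// Cloneability
class Base
{
public:
  // Virtual destructor so that clones can be deleted through a Base*.
  virtual ~Base() {}
  virtual Base* clone() const = 0;
};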

class Derived: public Base
{
public:
  virtual Derived* clone() const { return new Derived(*this); }
};

// Factory
class Factory
{
public:
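  // Returns a new object cloned from the exemplar registered under id.
  Base* get(std::string const& id) const;
  // Registers exemplar as the prototype for id.
  void set(std::string const& id, Base* exemplar);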

private:
  typedef std::map < std::string, Base* > exemplars_type;
  exemplars_type mExemplars;
};

It is somewhat traditional to make the Factory a singleton, but that's another matter entirely.

For deserialization proper, it's easier if you have a virtual method deserialize to call on the object.
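As a rough sketch of what that could look like: the factory only has to hand back an object of the right type, and the object then reads its own fields. The text-based payload, the members mCount and mLabel, and the helper load are all assumptions made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

class Base
{
public:
  virtual ~Base() {}
  // Each derived class knows how to read its own fields.
  virtual void deserialize(std::istream& in) = 0;
};

class Derived: public Base
{
public:
  Derived(): mCount(0) {}

  // Hypothetical format: an integer followed by a single word.
  void deserialize(std::istream& in) override { in >> mCount >> mLabel; }

  int count() const { return mCount; }
  std::string const& label() const { return mLabel; }

private:
  int mCount;
  std::string mLabel;
};

// Works on any Base, whichever factory produced it.
void load(Base& object, std::string const& payload)
{
  std::istringstream in(payload);
  object.deserialize(in);
}
```

The caller never needs to know the concrete type: it gets a Base from the factory and passes it to load together with the remaining bytes.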

EDIT: How does the Factory work?

In C++ you can't create an object of a type you don't know about. The idea above is therefore that the task of building a Derived object is given to the Derived class itself, by way of the clone method.

Next comes the Factory. We are going to use a map which will associate a "tag" (for example "Derived") with an instance of an object (say a Derived here).

Factory factory;
Derived derived;
factory.set("Derived", &derived);

Now, when we want to create an object whose type we don't know at compile time (because the type is decided on the fly), we pass a tag to the factory and ask for an object in return.

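// The unique_ptr constructor taking a raw pointer is explicit,
// so direct initialization is required.
std::unique_ptr<Base> base(factory.get("Derived"));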

Under the covers, the Factory will find the Base* associated with the "Derived" tag and invoke the clone method of that object. Here, this will actually create an object whose runtime type is Derived.

We can verify this by using the typeid operator:

// Dereference first: typeid(base) would name the pointer type, not the object.
assert( typeid(*base) == typeid(Derived) );
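Putting the pieces of this answer together, a self-contained sketch could look like the following. The bodies of get and set are not part of the answer, so the ones here are assumptions: get returns a null pointer for an unknown tag, and set stores the exemplar pointer without taking ownership.

```cpp
#include <map>
#include <memory>
#include <string>
#include <typeinfo>

class Base
{
public:
  virtual ~Base() {}
  virtual Base* clone() const = 0;
};

class Derived: public Base
{
public:
  Derived* clone() const override { return new Derived(*this); }
};

class Factory
{
public:
  // Clones the exemplar registered under id; the caller owns the result.
  // Assumption: an unknown id yields a null pointer.
  Base* get(std::string const& id) const
  {
    exemplars_type::const_iterator it = mExemplars.find(id);
    if (it == mExemplars.end()) { return nullptr; }
    return it->second->clone();
  }

  // Assumption: the factory does not own the exemplar,
  // so the exemplar must outlive every call to get.
  void set(std::string const& id, Base* exemplar) { mExemplars[id] = exemplar; }

private:
  typedef std::map < std::string, Base* > exemplars_type;
  exemplars_type mExemplars;
};
```

With this in place, factory.get("Derived") returns a distinct heap-allocated copy of the exemplar, which is why wrapping the result in a std::unique_ptr<Base> is a natural fit.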

in memory:
----------
type1 {
  chartype a;
  inttype b;
};
serialize(new type1());

serialized (ignore { and ,):
----------------------------
type1id,len{chartypeid,adata,inttypeid,bdata}

I guess that, in an ideal serialization protocol, every non-primitive type needs to be prefixed with typeid,len. Even if you serialize a single type that is not derived from anything, you would add a type id, because the other end has to know what type it's getting (regardless of the inheritance structure). So you have to include the derived class ids in the serialization, because logically they are different types. Correct me if I am wrong.
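A rough C++ sketch of such a typeid,len prefix, under some assumptions of mine: one byte for the type id, one byte for the length (so payloads are limited to 255 bytes), and, to keep it short, a length on primitive fields as well. The ids kType1, kChar and kInt and the helpers put and take are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical type ids shared by sender and receiver.
enum TypeId : std::uint8_t { kType1 = 1, kChar = 2, kInt = 3 };

// Appends one record: 1-byte type id, 1-byte length, then the payload.
void put(std::string& out, TypeId id, std::string const& payload)
{
  if (payload.size() > 255) throw std::length_error("payload too long");
  out.push_back(static_cast<char>(id));
  out.push_back(static_cast<char>(payload.size()));
  out += payload;
}

struct Record
{
  TypeId id;
  std::string payload;
};

// Reads the record starting at pos and advances pos past it.
Record take(std::string const& in, std::size_t& pos)
{
  if (pos + 2 > in.size()) throw std::out_of_range("truncated header");
  TypeId id = static_cast<TypeId>(static_cast<std::uint8_t>(in[pos]));
  std::size_t len = static_cast<std::uint8_t>(in[pos + 1]);
  if (pos + 2 + len > in.size()) throw std::out_of_range("truncated payload");
  Record r = { id, in.substr(pos + 2, len) };
  pos += 2 + len;
  return r;
}
```

The receiver reads the outer record first, looks at its id to decide which type to build, and then calls take again on the payload to walk the nested fields.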