Can I cast a char* buffer to an object-type?

Developer https://www.devze.com 2023-02-03 07:42 Source: web
I ask this question out of curiosity rather than difficulty, as I always learn from you, even on unrelated topics.

So, consider the following method, written in C++ and linked with g++. This method works fine, as everything is initialized to the correct size.

extern "C"
{
    void retrieveObject( int id, char * buffer )
    {
        Object::Object obj;

        extractObject( id, obj );
        memcpy( buffer, &obj, sizeof(obj) );
    }
}

// Prototype of extractObject
const bool extractObject( const int& id, Object::Object& obj ) const;

Now, I would like to avoid declaring a local Object and using memcpy.

I tried to replace retrieveObject with something like:

void retrieveObject( int id, char * buffer )
  {
    // Also tried dynamic_cast and C-Style cast
    extractObject( id, *(reinterpret_cast<Object::Object *>(buffer)) );
  }

It compiles and links successfully, but crashes right away. Considering that my buffer is large enough to hold an Object, does C++ need to call the constructor to "shape" the memory? Is there another way to replace the local variable and memcpy?

I hope I was clear enough for you to answer, thank you in advance.


In your first effort...

void retrieveObject( int id, char * buffer )
{
     Object::Object obj;
     extractObject( id, obj );
     memcpy( buffer, &obj, sizeof(obj) );
} 

...you still had the compiler create the local variable obj, which guarantees correct alignment. In the second effort...

void retrieveObject( int id, char * buffer )
{
     extractObject( id, *(reinterpret_cast<Object::Object *>(buffer)) );
} 

...you're promising the compiler that the buffer points to a byte that's aligned appropriately for an Object::Object. But will it be? Probably not, given your run-time crash. Generally, a char* can start on any given byte, whereas more complex objects are often aligned to the word size or to the largest alignment needed by their data members. Reading or writing ints, doubles, pointers etc. inside Object::Object may only work when the memory is properly aligned. It depends a bit on your CPU etc., but on UNIX/Linux, misalignment could generate e.g. a SIGBUS or SIGSEGV signal.

To explain this, let's consider a simple CPU/memory architecture. Say the memory allows, in any given operation, 4 bytes (a 32-bit architecture) to be read from addresses 0-3, 4-7, or 8-11 etc., but you can't read 4-byte chunks at addresses 1-4, 2-5, 3-6, 5-8.... Sounds strange, but that's actually quite a common limitation for memory, so just accept it and consider the consequences. Suppose we want to read a 4-byte number in memory. If it's at one of those multiple-of-4 addresses, we can get it in one memory read. Otherwise we have to read twice: once from the 4-byte area containing part of the data, then from the 4-byte area containing the rest, and then we throw away the bits we don't want and reassemble the rest in the proper places to get the 32-bit value into the CPU register/memory. That's too slow, so languages typically take care to put values where the memory can access them in one operation. Even the CPUs are designed with this expectation, as they often have instructions that operate on values in memory directly, without explicitly loading them into registers (i.e. that's an implementation detail beneath even the level of assembly/machine code). On CPUs that enforce this, code that asks the CPU to operate on data that's not aligned like this typically results in the CPU generating an interrupt, which the OS might manifest as a signal.
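
As a rough illustration of the alignment point, here is a minimal sketch that checks whether a buffer is suitably aligned before constructing an object in it with placement new. The Object struct below is a hypothetical stand-in (the real Object::Object layout isn't shown in the question), and isAlignedFor and constructInBuffer are names made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for Object::Object; the real class is not shown.
struct Object {
    int id;
    double value;
};

// Returns true when p is a multiple of alignof(T), i.e. a T may live there.
template <typename T>
bool isAlignedFor(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Constructs an Object in caller-provided storage using placement new.
// Precondition: buffer holds at least sizeof(Object) bytes and is aligned.
Object* constructInBuffer(void* buffer, int id) {
    assert(isAlignedFor<Object>(buffer));
    return new (buffer) Object{id, 0.5 * id};
}

// Possible usage: let the compiler align the storage for you.
//   alignas(Object) unsigned char storage[sizeof(Object)];
//   Object* obj = constructInBuffer(storage, 4);
```

If the caller controls the buffer, declaring it with alignas(Object) (or obtaining it from malloc/new, which return memory aligned for ordinary types) avoids the problem. If the caller doesn't, copying with memcpy as in the first effort remains the safe choice.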

That said, the other caveats about the safety of using this on non-POD data are also valid.


What you are doing is effectively serializing Object, and it will work fine if and only if all the data in Object is stored contiguously. For simple objects this will work fine, but the minute there are objects that contain pointers to other objects, this stops working.

In C++ it is extremely common for objects to contain other objects. std::string is a case in point: a string object typically holds a pointer to character data stored elsewhere (and some older implementations share that data through a reference count). So unless you are sure the object is a simple contiguous object, don't do this.
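
One way to have the compiler enforce "simple contiguous object" is a static_assert on std::is_trivially_copyable. The sketch below assumes C++11 or later; PlainRecord, NamedRecord, copyToBuffer and readFromBuffer are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical types: one safe to copy byte-wise, one not.
struct PlainRecord { int id; double value; };
struct NamedRecord { int id; std::string name; };

// Copies obj into buffer; refuses to compile for types that are not
// trivially copyable (e.g. anything containing a std::string).
template <typename T>
void copyToBuffer(const T& obj, char* buffer) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T cannot be copied with memcpy");
    assert(buffer != nullptr);
    std::memcpy(buffer, &obj, sizeof(T));
}

// Reads a T back out of a buffer previously filled by copyToBuffer.
template <typename T>
T readFromBuffer(const char* buffer) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T cannot be copied with memcpy");
    assert(buffer != nullptr);
    T obj;
    std::memcpy(&obj, buffer, sizeof(T));
    return obj;
}
```

With this in place, copyToBuffer(PlainRecord{...}, buf) compiles, while the same call with a NamedRecord is rejected at compile time instead of crashing later.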


You should take a look at Boost.Serialization or boost::interprocess::message_queue. C++ objects can contain more than data (for example, virtual table pointers), and that extra state is specific to the running program.

You should also consider adding version information to your objects when transferring them between modules.
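
To make the versioning idea concrete, here is a minimal sketch of a hand-rolled format that prefixes the payload with a small header. The Header and Payload layouts, kCurrentVersion, and both function names are assumptions invented for this example; they are not part of Boost or of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wire layout: a fixed header followed by the payload bytes.
struct Header {
    std::uint32_t version;
    std::uint32_t payloadSize;
};

// Hypothetical payload; must be trivially copyable for memcpy to be valid.
struct Payload {
    std::int32_t id;
    double value;
};

constexpr std::uint32_t kCurrentVersion = 1;

// Writes header + payload into buffer and returns the number of bytes used.
// Precondition: buffer holds at least sizeof(Header) + sizeof(Payload) bytes.
std::size_t writeVersioned(const Payload& p, char* buffer) {
    assert(buffer != nullptr);
    const Header h{kCurrentVersion, static_cast<std::uint32_t>(sizeof(Payload))};
    std::memcpy(buffer, &h, sizeof h);
    std::memcpy(buffer + sizeof h, &p, sizeof p);
    return sizeof h + sizeof p;
}

// Fills out and returns true only when the version and size both match.
bool readVersioned(const char* buffer, Payload& out) {
    assert(buffer != nullptr);
    Header h;
    std::memcpy(&h, buffer, sizeof h);
    if (h.version != kCurrentVersion || h.payloadSize != sizeof(Payload)) {
        return false;  // produced by a module with a different layout
    }
    std::memcpy(&out, buffer + sizeof h, sizeof out);
    return true;
}
```

The receiver can then reject a buffer from a module built against a different definition of the object instead of misinterpreting its bytes.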


Use a debugger to find out why and where it crashes. The code looks OK enough.

If you want to avoid the intermediate Object instance then simply avoid it. Make extractObject() return a pointer to Object and use this pointer to memcpy() its contents to the buffer.
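
A minimal sketch of that suggestion follows. It assumes the objects already live somewhere that outlives the call (here a hypothetical static table, g_objects), that Object is a simple trivially copyable struct, and that extractObject can be changed to return a pointer. None of this is confirmed by the question, and the int return value is also an addition for error reporting.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for Object::Object.
struct Object {
    int id;
    double value;
};

// Hypothetical storage; the question does not show where objects live.
static const Object g_objects[] = {{0, 0.0}, {1, 1.5}, {2, 3.0}};

// Hypothetical variant of extractObject that returns a pointer to the
// stored object, or nullptr when the id is unknown.
const Object* extractObject(int id) {
    for (const Object& o : g_objects) {
        if (o.id == id) return &o;
    }
    return nullptr;
}

// Copies the object straight from storage into the caller's buffer,
// with no intermediate local Object. Returns 1 on success, 0 otherwise.
extern "C" int retrieveObject(int id, char* buffer) {
    static_assert(std::is_trivially_copyable<Object>::value,
                  "Object must be safe to copy with memcpy");
    assert(buffer != nullptr);
    const Object* obj = extractObject(id);
    if (obj == nullptr) return 0;
    std::memcpy(buffer, obj, sizeof(Object));
    return 1;
}
```

The memcpy is still there, but it is the only copy, and it works for any buffer alignment.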

However, beware: as the others have said, if you then just reinterpret_cast<> the buffer back to Object, things might break if the Object is not simple enough.


Well, this may have many problems. First of all, if you use a local object, you cannot just construct it and then write the memory of some other instance over it (that works for POD types only, as they do not need their destructor to be called); otherwise you may very well get a nasty memory leak.

But that is not the main issue. The solution you provided may or may not work, depending on the type of the object used. It will work for simple POD types, and it may even work for more complex classes (provided you correctly handle constructor/destructor calls), but it will break the moment some other part of the program expects the object to be at its original location. Let's say you have a class that has 2 member variables:

struct A {
   int i;
   int * pi;
};

where 'pi' always points to the 'i' member. If you "memcpy" that object to some other location, it will easily break, because the copy's 'pi' still points at the original object's 'i'.
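
A small sketch of how this goes wrong, using the same struct. The helper names pointsToOwnMember, rawCopy and fixedCopy are made up for the example.

```cpp
#include <cassert>
#include <cstring>

struct A {
    int i;
    int* pi;  // intended invariant: always points at this object's own i
};

// True while the invariant holds for a.
bool pointsToOwnMember(const A& a) {
    return a.pi == &a.i;
}

// Byte-wise copy, which is what memcpy into a buffer amounts to.
// The pointer value is copied verbatim, so to.pi ends up aimed at from.i.
void rawCopy(const A& from, A& to) {
    assert(&from != &to);  // memcpy requires non-overlapping ranges
    std::memcpy(&to, &from, sizeof(A));
}

// A copy that re-establishes the invariant in the destination object.
void fixedCopy(const A& from, A& to) {
    to.i = from.i;
    to.pi = &to.i;
}
```

After rawCopy the copy looks fine until the original goes away or changes; then the copy's pi dangles or reads the wrong value. A copy constructor along the lines of fixedCopy is what such a class needs, and memcpy never runs it.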
