Unions between pointers and data, possible pitfalls?

Developer https://www.devze.com 2023-04-06 18:35 Source: web

I'm programming a system which has a massive amount of redundant data that needs to be kept in memory and accessible with as little latency as possible. (Uncompressed, the data is guaranteed to take up at least 1 GB of memory.)

One method I thought of is creating a container class like the following:

class Chunk{
    public:
        Chunk(){ ... };
        ~Chunk() { /*carefully delete elements according to mask*/ };
        Uint32 getElement(int index);
        void setElement(int index, Uint32 value);

    private:
        unsigned char mask;  // on bit == data is not-redundant, array is 8x8, 64 elements
        union{
            Uint32 redundant;  // all 8 elements are this value if mask bit == 0
            Uint32 * ptr;      // pointer to 8 allocated elements if mask bit == 1
        }array[8];
};

My question is: are there any unforeseen consequences of using a union to switch between a Uint32 primitive and a Uint32* pointer?

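This approach should be safe on all C++ implementations, as long as you only ever read the union member that was written last; the mask bit is what tells you which one that is.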
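For illustration, here is a minimal sketch of how the Chunk from the question might be filled in. The row layout (index / 8 picks the row), the isExpanded helper, the deleted copy operations, and the use of std::uint32_t in place of Uint32 are my own assumptions, not something the question specifies.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: 8 rows of 8 values. A row stays compressed as one shared
// value until an element in it is set to something different.
class Chunk {
public:
    Chunk() : mask(0) {
        for (auto &row : rows) row.redundant = 0;
    }
    ~Chunk() {
        // Only rows whose mask bit is set own heap memory.
        for (int r = 0; r < 8; ++r)
            if (mask & (1u << r)) delete[] rows[r].ptr;
    }
    Chunk(const Chunk &) = delete;             // a plain copy would double-free
    Chunk &operator=(const Chunk &) = delete;

    std::uint32_t getElement(int index) const {
        assert(index >= 0 && index < 64);
        const int r = index / 8;
        // The mask bit says which union member was written last.
        return (mask & (1u << r)) ? rows[r].ptr[index % 8] : rows[r].redundant;
    }

    void setElement(int index, std::uint32_t value) {
        assert(index >= 0 && index < 64);
        const int r = index / 8;
        if (!(mask & (1u << r))) {
            if (rows[r].redundant == value) return;  // row stays compressed
            // Expand the row: copy the shared value into 8 real slots.
            const std::uint32_t fill = rows[r].redundant;
            std::uint32_t *p = new std::uint32_t[8];
            for (int i = 0; i < 8; ++i) p[i] = fill;
            rows[r].ptr = p;                         // ptr is now the active member
            mask |= static_cast<unsigned char>(1u << r);
        }
        rows[r].ptr[index % 8] = value;
    }

    // Hypothetical helper, only here to make the state observable.
    bool isExpanded(int row) const { return (mask >> row) & 1u; }

private:
    unsigned char mask;  // bit r set == row r is expanded (ptr is active)
    union Row {
        std::uint32_t redundant;
        std::uint32_t *ptr;
    } rows[8];
};
```

The point is that every read of the union goes through the mask first, so the code never reads redundant after ptr was stored, or the other way round.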

Note, however, that if you know your platform's memory alignment requirements, you may be able to do better than this. In particular, if you know that memory allocations are aligned to 2 bytes or greater (many platforms use 8 or 16 bytes), you can use the lower bit of the pointer as a flag:

class Chunk {
 //...
  uintptr_t ptr;
};

// In your get function:    
if ( (ptr & 1) == 0 ) {
  return ((uint32_t *)ptr)[index];
} else {
  return *((uint32_t *)(ptr & ~(uintptr_t)1));
}
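To make the tagging idea concrete, here is a small self-contained sketch. The names TaggedRow, makeArrayRow, makeSharedRow and readRow are hypothetical, and it assumes that clearing the low bit of the converted integer gives back the original pointer, which is implementation-defined but holds on common platforms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one row stored as a tagged pointer.
// Low bit 0 -> points to an array of 8 values.
// Low bit 1 -> points to a single value shared by all 8 slots.
struct TaggedRow {
    std::uintptr_t bits;
};

inline TaggedRow makeArrayRow(std::uint32_t *array) {
    const auto raw = reinterpret_cast<std::uintptr_t>(array);
    assert((raw & 1u) == 0);  // the tag bit must be free
    return TaggedRow{raw};
}

inline TaggedRow makeSharedRow(std::uint32_t *value) {
    const auto raw = reinterpret_cast<std::uintptr_t>(value);
    assert((raw & 1u) == 0);
    return TaggedRow{raw | 1u};  // set the tag
}

inline std::uint32_t readRow(TaggedRow row, int index) {
    assert(index >= 0 && index < 8);
    if ((row.bits & 1u) == 0)
        return reinterpret_cast<std::uint32_t *>(row.bits)[index];
    // Strip the tag before dereferencing; note the mask is ~1, not ~0.
    return *reinterpret_cast<std::uint32_t *>(
        row.bits & ~static_cast<std::uintptr_t>(1));
}
```

Whatever allocates the memory also has to strip the tag before freeing it, since the tagged value is not a valid pointer.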

You can further reduce space usage by using a custom allocation method (with placement new) and placing the pointer immediately after the class, in a single memory allocation (i.e., you allocate room for Chunk and either the mask or the array, and have ptr point immediately after Chunk). Or, if you know most of your data will have the low bit off, you can use the ptr field directly as the fill-in value:

} else {
  return ptr & ~(uintptr_t)1;
}

If it's the high bit that's usually unused, a bit of bit shifting will work:

} else {
  return ptr >> 1;
}
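The shifting variant needs a matching encode step, which could look roughly like this. The function names are hypothetical, and the sketch assumes the low bit set means "inline value" while a clear low bit means "real pointer".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: the fill value lives in the upper bits of the
// pointer-sized field and the low bit is set to mark it as inline.
inline std::uintptr_t encodeInline(std::uint32_t value) {
    // If uintptr_t is only 32 bits wide, the top bit of value would be
    // shifted out, so the value must fit in 31 bits on such platforms.
    assert(sizeof(std::uintptr_t) > sizeof(std::uint32_t) || (value >> 31) == 0);
    return (static_cast<std::uintptr_t>(value) << 1) | 1u;
}

inline bool isInline(std::uintptr_t bits) { return (bits & 1u) != 0; }

inline std::uint32_t decodeInline(std::uintptr_t bits) {
    assert(isInline(bits));
    return static_cast<std::uint32_t>(bits >> 1);  // drop the tag bit
}
```

With a 64-bit uintptr_t every 32-bit value fits; with a 32-bit one you really do lose the high bit, which is why this only works when that bit is usually unused.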

Note that this approach of tagging pointers is not portable. It is only safe if you can ensure your memory allocations will be properly aligned. On most desktop OSes this will not be a problem, since malloc already ensures some degree of alignment; on Unixes, you can be absolutely sure by using posix_memalign. If you can obtain such a guarantee for your platform, though, this approach can be quite effective.
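One way to check the alignment assumption rather than just trust it is sketched below. It uses plain malloc with a compile-time check instead of posix_memalign so that it is not tied to POSIX; lowBitIsFree and allocateRow are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// malloc has to return storage aligned for any fundamental type, so if
// max_align_t needs at least 2-byte alignment, the low bit of every
// malloc'd address is zero.
static_assert(alignof(std::max_align_t) >= 2,
              "low pointer bit is not guaranteed to be free");

inline bool lowBitIsFree(const void *p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 1u) == 0;
}

// Hypothetical wrapper: allocate one row of 8 values and verify at run
// time that the tag bit is clear. The caller releases it with std::free.
inline std::uint32_t *allocateRow() {
    void *p = std::malloc(8 * sizeof(std::uint32_t));
    assert(p == nullptr || lowBitIsFree(p));
    return static_cast<std::uint32_t *>(p);
}
```

If the static_assert ever fails on some platform, that is the signal to fall back to the union plus mask instead of tagging.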

If space is at a premium, you may be wasting memory. A union is as large as its largest member, which in this case could be up to 64 bits for the pointer.
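A quick way to see the cost on your own platform is to let the compiler compute it. Slot and wastedBytesPerCompressedSlot are names made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Same shape as the union in the question.
union Slot {
    std::uint32_t redundant;
    std::uint32_t *ptr;
};

// A union can never be smaller than its largest member.
static_assert(sizeof(Slot) >= sizeof(std::uint32_t *),
              "union must be able to hold the pointer");
static_assert(sizeof(Slot) >= sizeof(std::uint32_t),
              "union must be able to hold the value");

// Bytes that sit unused in a slot holding only the 32-bit value:
// typically 4 with 64-bit pointers and 0 with 32-bit pointers.
constexpr std::size_t wastedBytesPerCompressedSlot =
    sizeof(Slot) - sizeof(std::uint32_t);
```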

If you stick to 32-bit architectures you should not have problems with the cast.