Network Data Packing

Developer https://www.devze.com 2023-04-04 04:36 Source: Internet
I was searching for a way to efficiently pack my data in order to send it over a network. I found a topic which suggested a way: http://www.sdltutorials.com/cpp-tip-packing-data

And I've also seen it being used in commercial applications. So I decided to give it a try, but the results weren't what I expected.

  1. First of all, the whole point of "packing" your data is to save bytes. But I don't think the algorithm mentioned above saves any bytes at all. Without packing, the server would send 4 bytes (Data); after packing, the server sends a character array that is also 4 bytes long. So it seems pointless.

  2. Aside from that, why would someone add 0xFF? It doesn't seem to do anything at all.

The code snippet found in the tutorial mentioned above:

    unsigned char Buffer[3];
    unsigned int Data = 1024;
    unsigned int UpackedData;
    Buffer[0] = (Data >> 24) & 0xFF;
    Buffer[1] = (Data >> 12) & 0xFF;
    Buffer[2] = (Data >> 8) & 0xFF;
    Buffer[3] = (Data ) & 0xFF;
    UnpackedData = (Buffer[0] << 24) | (Buffer[1] << 12) | (Buffer[2] << 8) | (Buffer[3] & 0xFF);

Result:

    0040 // 4 bytes long character
    1024 // 4 bytes long


The & 0xFF is there to make sure each stored value is between 0 and 255, i.e. that it fits in one byte.

I wouldn't place too much credence in that posting; aside from your objection, the code contains an obvious mistake. Buffer is only 3 elements long, but the code stores data in 4 elements.
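For what it's worth, besides the buffer size, the second shift in the quoted snippet is 12 where a byte-wise split would need 16. A corrected version might look roughly like the sketch below. The function names `packU32` and `unpackU32` are made up for illustration and are not from the tutorial; the sketch assumes a 32-bit value sent most significant byte first.

```cpp
#include <cstdint>

// Hypothetical helper names, not taken from the tutorial.
// Store a 32-bit value as 4 bytes, most significant byte first.
void packU32(std::uint32_t value, unsigned char (&buffer)[4])
{
    buffer[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buffer[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buffer[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    buffer[3] = static_cast<unsigned char>(value & 0xFF);
}

// Rebuild the value from the 4 bytes written by packU32.
std::uint32_t unpackU32(const unsigned char (&buffer)[4])
{
    return (static_cast<std::uint32_t>(buffer[0]) << 24) |
           (static_cast<std::uint32_t>(buffer[1]) << 16) |
           (static_cast<std::uint32_t>(buffer[2]) << 8)  |
            static_cast<std::uint32_t>(buffer[3]);
}
```

This still sends 4 bytes for a 4-byte value, so it saves nothing. What it buys is a fixed byte order on the wire, independent of the sender's and receiver's native layout.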


For integers, a simple method I've often found useful is BER encoding. Basically, for an unsigned integer you write 7 bits in each byte, using the 8th bit to mark whether another byte is needed:

    #include <vector>

    void berPack(unsigned x, std::vector<unsigned char>& out)
    {
        while (x >= 128)
        {
            // Write the low 7 bits; 8th bit = 1 means more bytes follow
            out.push_back(static_cast<unsigned char>(128 + (x & 127)));
            x >>= 7;
        }
        // Write the last bits; 8th bit = 0 ends the number
        out.push_back(static_cast<unsigned char>(x));
    }
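The receiving side is not shown here, so the following is only a sketch of what a matching decoder could look like. `berUnpack` is a hypothetical name, and the sketch assumes well-formed input (it does no bounds or overflow checking). Note that the encoder emits the least significant 7-bit group first, so the decoder shifts each later group further left.

```cpp
#include <cstddef>
#include <vector>

// Same encoder as above, repeated so the sketch compiles on its own.
void berPack(unsigned x, std::vector<unsigned char>& out)
{
    while (x >= 128)
    {
        out.push_back(static_cast<unsigned char>(128 + (x & 127)));
        x >>= 7;
    }
    out.push_back(static_cast<unsigned char>(x));
}

// Hypothetical decoder: reads one value starting at in[pos] and advances pos
// past it. Assumes the input is well formed (no bounds or overflow checks).
unsigned berUnpack(const std::vector<unsigned char>& in, std::size_t& pos)
{
    unsigned result = 0;
    unsigned shift = 0;
    for (;;)
    {
        unsigned char byte = in[pos++];
        result |= static_cast<unsigned>(byte & 127) << shift; // low 7 bits carry data
        if ((byte & 128) == 0) break;                         // 8th bit clear: last byte
        shift += 7;
    }
    return result;
}
```

Because `pos` is advanced past each value, several numbers can be packed back to back into one buffer and read out in the same order.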

For a signed integer, you encode the sign in the least significant bit and then use the same encoding as before:

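    void berPack(int x, std::vector<unsigned char>& out)
    {
        // Sign goes in bit 0, magnitude in the remaining bits.
        // 0u - unsigned(x) gets the magnitude without signed overflow;
        // INT_MIN is still not representable (its magnitude overflows the shift).
        if (x < 0) berPack(((0u - unsigned(x)) << 1) + 1, out);
              else berPack((unsigned(x) << 1), out);
    }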
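To read such a value back, the mapping has to be undone after the unsigned decode. The sketch below isolates just that mapping; `toSignBit` and `fromSignBit` are hypothetical names, and the most negative `int` is assumed never to be sent, since its magnitude does not fit once shifted.

```cpp
// Hypothetical helpers that isolate the sign-in-bit-0 mapping.
// Assumes the most negative int is never passed in.
unsigned toSignBit(int x)
{
    return x < 0 ? ((0u - static_cast<unsigned>(x)) << 1) + 1 // magnitude, then sign = 1
                 : static_cast<unsigned>(x) << 1;             // magnitude, then sign = 0
}

int fromSignBit(unsigned u)
{
    int magnitude = static_cast<int>(u >> 1); // drop the sign bit
    return (u & 1) ? -magnitude : magnitude;  // bit 0 set means negative
}
```

With this mapping, values from -63 to 63 end up below 128 and therefore fit in a single byte.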

With this approach, small numbers use less space. Another advantage is that the encoding is already architecture-neutral (i.e. the data will be understood correctly regardless of the endianness of the system). The same format can also handle different integer sizes, so you can send data from a 32-bit system to a 64-bit system without problems (assuming, of course, that the values themselves do not overflow).

The price to pay is that, for example, unsigned values from 268435456 (1 << 28) to 4294967295 ((1 << 32) - 1) require 5 bytes instead of the 4 bytes of standard fixed-width packing.
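To see where the size boundaries fall, a small helper can count the bytes the encoding would need without actually writing them. `berSize` is a hypothetical name used only for this sketch.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes the 7-bits-per-byte encoding needs.
std::size_t berSize(unsigned long long x)
{
    std::size_t bytes = 1; // even zero takes one byte
    while (x >= 128)
    {
        x >>= 7;           // each extra byte carries 7 more bits
        ++bytes;
    }
    return bytes;
}
```

Values up to 127 take 1 byte, up to 16383 take 2, up to 2097151 take 3, and up to 268435455 take 4. Whether the scheme pays off therefore depends on how often the data stays below those thresholds.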


  1. Another reason for packing is to enforce a consistent structure, so that data written by one machine can be reliably read by another.

  2. It's not "adding"; it's a bitwise AND that masks the value down to its LSB (least-significant byte). But it doesn't look necessary here.
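As a quick illustration of why the mask looks unnecessary: converting an unsigned value to `unsigned char` already keeps only the low byte, so on a platform with 8-bit bytes (assumed here) both helpers below return the same result. The helper names are hypothetical and exist only for the comparison.

```cpp
// Hypothetical helpers for comparison only; assumes 8-bit bytes.
unsigned char lowByteMasked(unsigned value)
{
    return static_cast<unsigned char>(value & 0xFF); // explicit mask, then convert
}

unsigned char lowByteTruncated(unsigned value)
{
    return static_cast<unsigned char>(value);        // conversion alone drops the high bits
}
```

Keeping the mask is still a reasonable choice, since it states the intent and can avoid narrowing warnings from some compilers.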
