I am building PB (protocol buffer) messages containing 4 int32 fields. I wanted to add two more boolean fields to this message, and I noticed that the size increased by 4 bytes for the 2 boolean fields. Do booleans really occupy 2 bytes per field? Can't they be stored in a more compact form?
Each field header takes 1 byte (for field numbers 1 through 15; more for larger field numbers), and each bool takes 1 byte. There is no sub-byte packing in protobuf; every value is rounded up to a whole number of bytes.
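To make the 2-bytes-per-bool figure concrete, here is a rough sketch that hand-encodes a bool field the way the protobuf wire format does (tag varint followed by value varint). The helper names and the field numbers 5, 6 and 16 are made up for illustration; this is not the protobuf library's API.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # MSB set: more bytes follow
        else:
            out.append(low7)         # MSB clear: last byte
            return bytes(out)


def encode_bool_field(field_number: int, flag: bool) -> bytes:
    """Encode one bool field: tag (header) followed by a 0/1 varint."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(1 if flag else 0)


# Hypothetical field numbers 5 and 6, as if added after four int32 fields.
two_bools = encode_bool_field(5, True) + encode_bool_field(6, True)
print(two_bools.hex(), len(two_bools))

# Field numbers of 16 and above need a 2-byte tag.
print(len(encode_bool_field(16, True)))
```

With low field numbers each bool costs 1 header byte plus 1 data byte, which matches the 4 extra bytes seen for two bools.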
One option would be to store an unsigned integer bitmap, i.e. treat each bool as a power of two and combine them (bitwise OR). Then you need one byte of header and one byte of data, so 2 bytes for both bools. You will, however, have to pack/unpack the integer yourself. Assuming you have a low field number, up to 7 bools then take 1 byte of data plus 1 byte of header. Since "varint" uses base-128 encoding (the MSB of each byte is a continuation flag), 8 to 14 bools would take up to 2 bytes of data plus 1 byte of header.
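The packing described above could look roughly like this. It is a minimal sketch: `pack_flags`, `unpack_flags` and `varint_size` are hypothetical helpers, and the packed value is assumed to be stored in a single unsigned integer field (for example a `uint32`) in the message.

```python
def pack_flags(flags):
    """Combine a list of bools into one integer, bit i for flags[i]."""
    packed = 0
    for i, flag in enumerate(flags):
        if flag:
            packed |= 1 << i
    return packed


def unpack_flags(packed, count):
    """Recover `count` bools from a packed integer."""
    return [bool((packed >> i) & 1) for i in range(count)]


def varint_size(value):
    """Number of bytes a non-negative integer takes as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


flags = [True, False, True]
packed = pack_flags(flags)
assert unpack_flags(packed, len(flags)) == flags

# Data bytes needed when every flag is set (the worst case).
for count in (2, 7, 8, 14, 15):
    print(count, varint_size(pack_flags([True] * count)))
```

The varint length depends on the highest bit that is actually set, so a packed value with 8 flags only needs the second data byte when the 8th flag is true.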
It's not that surprising. It probably uses one byte as a type flag for the following value, and then stores the bit in a byte. You could pack many bits into an integer type (8 in a char, 16 in a short, 32 in an int) for more efficient transfer. Whether the bandwidth saving is worth the trouble of writing and running the packing code will depend on your application.
There's always a tradeoff on space vs. access time.
There are any number of ways your JVM's designer could have chosen to make that tradeoff. It may be that he chose to insist that structures always occupy a multiple of 4 bytes, so every structure would be longword-aligned. It may be that he decided that booleans should occupy two bytes apiece, as that made addressing simpler. It may be that he allocated one bit apiece for the two booleans, in one byte, and then padded it out to a two-byte boundary.
Without knowing exactly how the structure was laid out, and how the access to the booleans works, there's not much you can do.