I am currently working on a network tool that needs to decode/encode a particular protocol that packs fields into dense bit arrays at arbitrary positions. For example, one part of the protocol uses 3 bytes to represent a number of different fields:
Bit Position(s) Length (In Bits) Type
0 1 bool
1-5 5 int
6-13 8 int
14-22 9 uint
23 1 bool
As you can see, several of the fields span multiple bytes. Many (most) are also shorter than the built-in type that might be used to represent them, such as the first int field which is only 5 bits long. In these cases, the most significant bits of the target type (such as an Int32 or Int16) should be padded with 0 to make up the difference.
My problem is that I am having a difficult time processing this kind of data. Specifically, I am having a hard time figuring out how to efficiently get arbitrary length bit arrays, populate them with the appropriate bits from the source buffer, pad them to match the target type, and convert the padded bit arrays to the target type. In an ideal world, I would be able to take the byte[3] in the example above and call a method like GetInt32(byte[] bytes, int startBit, int length)
.
The closest thing in the wild that I've found is a BitStream class, but it appears to want individual values to line up on byte/word boundaries (and the half-streaming/half-indexed access convention of the class makes it a little confusing).
My own first attempt was to use the BitArray class, but that proved somewhat unwieldy. It's easy enough to stuff all the bits from the buffer into a large BitArray
, transfer only the ones you want from the source BitArray
to a new temporary BitArray
, and then convert that into the target value...but it seems wrong, and very time consuming.
I am now considering a class like the following that references (or creates) a source/target byte[] buffer along with an offset and provides get and set methods for certain target types. The tricky part is that getting/setting values may span multiple bytes.
class BitField
{
private readonly byte[] _bytes;
private readonly int _offset;
public BitField(byte[] bytes)
: this(bytes, 0)
{
}
public BitField(byte[] bytes, int offset)
{
_bytes = bytes;
_offset = offset;
}
public BitField(int size)
: this(new byte[size], 0)
{
}
public bool this[int bit]
{
get { return IsSet(bit); }
set { if (value) Set(bit); else Clear(bit); }
}
public bool IsSet(int bit)
{
return (_bytes[_offset + (bit / 8)] & (1 << (bit % 8))) != 0;
}
publ开发者_JAVA百科ic void Set(int bit)
{
_bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (bit % 8)));
}
public void Clear(int bit)
{
_bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (bit % 8)));
}
//startIndex = the index of the bit at which to start fetching the value
//length = the number of bits to include - may be less than 32 in which case
//the most significant bits of the target type should be padded with 0
public int GetInt32(int startIndex, int length)
{
//NEED CODE HERE
}
//startIndex = the index of the bit at which to start storing the value
//length = the number of bits to use, if less than the number of bits required
//for the source type, precision may be lost
//value = the value to store
public void SetValue(int startIndex, int length, int value)
{
//NEED CODE HERE
}
//Other Get.../Set... methods go here
}
I am looking for any guidance in this area such as third-party libraries, algorithms for getting/setting values at arbitrary bit positions that span multiple bytes, feedback on my approach, etc. I included the class above for clarification and am not necessarily looking for code to fill it in (though I won't argue if someone wants to work it out!).
As promised, here is the class I ended up creating for this purpose. It will wrap an arbitrary byte array at an optionally specified index and allowing reading/writing at the bit level. It provides methods for reading/writing arbitrary blocks of bits from other byte arrays or for reading/writing primitive values with user-defined offsets and lengths. It works very well for my situation and solves the exact question I asked above. However, it does have a couple shortcomings. The first is that it is obviously not greatly documented - I just haven't had the time. The second is that there are no bounds or other checks. It also currently requires the MiscUtil library to provide endian conversion. All that said, hopefully this can help solve or serve as a starting point for someone else with a similar use case.
internal class BitField
{
private readonly byte[] _bytes;
private readonly int _offset;
private EndianBitConverter _bitConverter = EndianBitConverter.Big;
public BitField(byte[] bytes)
: this(bytes, 0)
{
}
//offset = the offset (in bytes) into the wrapped byte array
public BitField(byte[] bytes, int offset)
{
_bytes = bytes;
_offset = offset;
}
public BitField(int size)
: this(new byte[size], 0)
{
}
//fill == true = initially set all bits to 1
public BitField(int size, bool fill)
: this(new byte[size], 0)
{
if (!fill) return;
for(int i = 0 ; i < size ; i++)
{
_bytes[i] = 0xff;
}
}
public byte[] Bytes
{
get { return _bytes; }
}
public int Offset
{
get { return _offset; }
}
public EndianBitConverter BitConverter
{
get { return _bitConverter; }
set { _bitConverter = value; }
}
public bool this[int bit]
{
get { return IsBitSet(bit); }
set { if (value) SetBit(bit); else ClearBit(bit); }
}
public bool IsBitSet(int bit)
{
return (_bytes[_offset + (bit / 8)] & (1 << (7 - (bit % 8)))) != 0;
}
public void SetBit(int bit)
{
_bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (7 - (bit % 8))));
}
public void ClearBit(int bit)
{
_bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (7 - (bit % 8))));
}
//index = the index of the source BitField at which to start getting bits
//length = the number of bits to get
//size = the total number of bytes required (0 for arbitrary length return array)
//fill == true = set all padding bits to 1
public byte[] GetBytes(int index, int length, int size, bool fill)
{
if(size == 0) size = (length + 7) / 8;
BitField bitField = new BitField(size, fill);
for(int s = index, d = (size * 8) - length ; s < index + length && d < (size * 8) ; s++, d++)
{
bitField[d] = IsBitSet(s);
}
return bitField._bytes;
}
public byte[] GetBytes(int index, int length, int size)
{
return GetBytes(index, length, size, false);
}
public byte[] GetBytes(int index, int length)
{
return GetBytes(index, length, 0, false);
}
//bytesIndex = the index (in bits) into the bytes array at which to start copying
//index = the index (in bits) in this BitField at which to put the value
//length = the number of bits to copy from the bytes array
public void SetBytes(byte[] bytes, int bytesIndex, int index, int length)
{
BitField bitField = new BitField(bytes);
for (int i = 0; i < length; i++)
{
this[index + i] = bitField[bytesIndex + i];
}
}
public void SetBytes(byte[] bytes, int index, int length)
{
SetBytes(bytes, 0, index, length);
}
public void SetBytes(byte[] bytes, int index)
{
SetBytes(bytes, 0, index, bytes.Length * 8);
}
//UInt16
//index = the index (in bits) at which to start getting the value
//length = the number of bits to use for the value, if less than required the value is padded with 0
public ushort GetUInt16(int index, int length)
{
return _bitConverter.ToUInt16(GetBytes(index, length, 2), 0);
}
public ushort GetUInt16(int index)
{
return GetUInt16(index, 16);
}
//valueIndex = the index (in bits) of the value at which to start copying
//index = the index (in bits) in this BitField at which to put the value
//length = the number of bits to copy from the value
public void Set(ushort value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(ushort value, int index)
{
Set(value, 0, index, 16);
}
//UInt32
public uint GetUInt32(int index, int length)
{
return _bitConverter.ToUInt32(GetBytes(index, length, 4), 0);
}
public uint GetUInt32(int index)
{
return GetUInt32(index, 32);
}
public void Set(uint value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(uint value, int index)
{
Set(value, 0, index, 32);
}
//UInt64
public ulong GetUInt64(int index, int length)
{
return _bitConverter.ToUInt64(GetBytes(index, length, 8), 0);
}
public ulong GetUInt64(int index)
{
return GetUInt64(index, 64);
}
public void Set(ulong value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(ulong value, int index)
{
Set(value, 0, index, 64);
}
//Int16
public short GetInt16(int index, int length)
{
return _bitConverter.ToInt16(GetBytes(index, length, 2, IsBitSet(index)), 0);
}
public short GetInt16(int index)
{
return GetInt16(index, 16);
}
public void Set(short value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(short value, int index)
{
Set(value, 0, index, 16);
}
//Int32
public int GetInt32(int index, int length)
{
return _bitConverter.ToInt32(GetBytes(index, length, 4, IsBitSet(index)), 0);
}
public int GetInt32(int index)
{
return GetInt32(index, 32);
}
public void Set(int value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(int value, int index)
{
Set(value, 0, index, 32);
}
//Int64
public long GetInt64(int index, int length)
{
return _bitConverter.ToInt64(GetBytes(index, length, 8, IsBitSet(index)), 0);
}
public long GetInt64(int index)
{
return GetInt64(index, 64);
}
public void Set(long value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(long value, int index)
{
Set(value, 0, index, 64);
}
//Char
public char GetChar(int index, int length)
{
return _bitConverter.ToChar(GetBytes(index, length, 2), 0);
}
public char GetChar(int index)
{
return GetChar(index, 16);
}
public void Set(char value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(char value, int index)
{
Set(value, 0, index, 16);
}
//Bool
public bool GetBool(int index, int length)
{
return _bitConverter.ToBoolean(GetBytes(index, length, 1), 0);
}
public bool GetBool(int index)
{
return GetBool(index, 8);
}
public void Set(bool value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(bool value, int index)
{
Set(value, 0, index, 8);
}
//Single and double precision floating point values must always use the correct number of bits
public float GetSingle(int index)
{
return _bitConverter.ToSingle(GetBytes(index, 32, 4), 0);
}
public void SetSingle(float value, int index)
{
SetBytes(_bitConverter.GetBytes(value), 0, index, 32);
}
public double GetDouble(int index)
{
return _bitConverter.ToDouble(GetBytes(index, 64, 8), 0);
}
public void SetDouble(double value, int index)
{
SetBytes(_bitConverter.GetBytes(value), 0, index, 64);
}
}
If your packets are always smaller than 8 or 4 bytes it would be easier to store each packet in an Int32
or Int64
. The byte array only complicates things. You do have to pay attention to High-Endian vs Low-Endian storage.
And then, for a 3 byte package:
public static void SetValue(Int32 message, int startIndex, int length, int value)
{
// we want lengthx1
int mask = (1 << length) - 1;
value = value & mask; // or check and throw
int offset = 24 - startIndex - length; // 24 = 3 * 8
message = message | (value << offset);
}
First up, it seems you have re invented the wheel with the System.Collections.BitArray class. As for actually finding the value of a specific field of bits, I think that can easily be accomplished with a little math magic of the following pseudocode:
- Start at the most distant digit in your selection (startIndex + length).
- If it is set, add 2^(distance from digit). In this case, it would be 0 (mostDistance - self = 0). So Add 2^0 (1).
- Move one bit to the left.
- Repeat for each digit in the length you want.
In that situation, if you had a bit array like so:
10001010
And you want the value of digits 0-3, you would get something like:
[Index 3] [Index 2] [Index 1] [Index 0]
(3 - 3) (3 - 2) (3 - 1) (3 - 0)
=============================================
(0 * 2^0) + (0 * 2^1) + (0 * 2^2) + (1 * 2^3) = 8
Since 1000 (binary) == 8, the math works out.
What's the problem with just using simple bit shifts to get your values out?
int data = Convert.ToInt32( "110000000010000000100001", 2 );
bool v1 = ( data & 1 ) == 1; // True
int v2 = ( data >> 1 ) & 0x1F; // 16
int v3 = ( data >> 6 ) & 0xFF; // 128
uint v4 = (uint )( data >> 14 ) & 0x1FF; // 256
bool v5 = ( data >> 23 ) == 1; // True
This is a pretty good article covering the subject. it's in C, but the same concepts still apply.
精彩评论