How to parse/encode binary message formats?

Developer https://www.devze.com 2023-03-23 09:55 Source: Internet
I need to parse and encode to a legacy binary message format in Java. I began by using DataOutputStream to read/write primitive types, but the problem I'm having is that the message format doesn't align nicely to byte offsets and includes bit flags.

For example I have to deal with messages like this:

+----------+---+---+----------+------------+----------------+
| uint32   | b | b | uint32   | 4-bit enum | 32-byte string |
+----------+---+---+----------+------------+----------------+

Where (b) is a one-bit flag. The problem is that these fields don't align to byte boundaries, so I wouldn't be able to use DataOutputStream to encode this, since the smallest unit it can write is a byte.

Are there any libraries, standard or 3rd party, for dealing with arbitrary bit level message formats?

Edit: Thanks to @Software Monkey for forcing me to look at my spec more closely. The spec I am using does actually align on byte boundaries so DataOutputStream is appropriate. Given my original question though I would have gone with the solution proposed by @emboss.

Edit: Although the message format for this question was discovered to be on byte boundaries I've come across another message format that is applicable to the original question. This format defines a 6 bit character mapping where each character really only takes up 6 bits, not the full byte, so character strings do not align on byte boundaries. I have discovered several binary output streams that tackle this problem. Like this one: http://introcs.cs.princeton.edu/java/stdlib/BinaryOut.java.html
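For illustration, here is a minimal sketch of the idea behind such bit-level output streams. It assumes bits are packed most significant bit first and that the final partial byte is padded with zeros; the class name BitPacker and its methods are hypothetical and not taken from the linked BinaryOut class or from any spec.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper: packs values of arbitrary bit width, MSB-first, into bytes.
final class BitPacker {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current; // bits collected so far for the byte being built
    private int filled;  // how many bits of 'current' are in use (0..7)

    // Appends the lowest 'width' bits of 'value', most significant bit first.
    void write(int value, int width) {
        if (width < 0 || width > 32) {
            throw new IllegalArgumentException("width must be 0..32");
        }
        for (int i = width - 1; i >= 0; i--) {
            current = (current << 1) | ((value >>> i) & 1);
            if (++filled == 8) {
                out.write(current); // a full byte is ready, flush it
                current = 0;
                filled = 0;
            }
        }
    }

    // Returns everything written so far; a trailing partial byte is zero-padded.
    byte[] toByteArray() {
        byte[] done = out.toByteArray();
        if (filled == 0) {
            return done;
        }
        byte[] result = Arrays.copyOf(done, done.length + 1);
        result[done.length] = (byte) (current << (8 - filled));
        return result;
    }
}
```

With this sketch, four 6-bit character codes written via write(code, 6) occupy exactly three bytes instead of four.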


There is a builtin byte type in Java, and you can read into byte[] buffers just fine using InputStream#read(byte[]) and write to an OutputStream using OutputStream#write(byte[], int, int), so there's no problem in that.

Regarding your messages - as you noted correctly, the tiniest bit of information you get at a time is a byte, so you will have to decompose your message format into 8 bit chunks first:

Let's suppose your message is in a byte[] named data. I also assume big-endian byte order (most significant byte first), which is what the shifts below implement.

A uint32 is 32 bits long, i.e. four bytes. Be careful when parsing this in Java: Java ints and longs are signed, so a uint32 with its top bit set would come out negative if stored in an int. An easy way to avoid trouble is to store the result in a long. data[0] fills bits 31 to 24, data[1] bits 23 to 16, data[2] bits 15 to 8 and data[3] bits 7 to 0. So you need to shift them appropriately to the left and glue them together with bitwise OR:

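// Mask with 0xFFL (a long literal) so the shifts happen in 64 bits;
// with a plain 0xFF, data[0] >= 0x80 would produce a negative int.
long uint32 = ((data[0] & 0xFFL) << 24) |
              ((data[1] & 0xFFL) << 16) |
              ((data[2] & 0xFFL) << 8)  |
               (data[3] & 0xFFL);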
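As an alternative to shifting by hand, the same value can be read with java.nio.ByteBuffer. This is a sketch under the same assumption as above (data[0] holds the most significant byte); the class and method names UInt32Codec, readUInt32 and writeUInt32 are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for byte-aligned unsigned 32-bit fields.
final class UInt32Codec {
    // Reads four bytes starting at 'offset' and returns them as an unsigned value.
    static long readUInt32(byte[] data, int offset) {
        int raw = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN).getInt(offset);
        return Integer.toUnsignedLong(raw); // available since Java 8
    }

    // Encodes the low 32 bits of 'value' as four bytes, most significant first.
    static byte[] writeUInt32(long value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt((int) value).array();
    }
}
```

This only works for fields that start on a byte boundary; for fields that straddle bytes you still need the masking and shifting described here.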

Next, there are two single bits. I suppose you have to check whether they are "on" (1) or "off" (0). To do this, you use bit masks and combine them with your byte using bitwise AND.

First bit: ( binary mask | 1 0 0 0 0 0 0 0 | = 128 = 0x80 )

boolean firstFlag = (data[4] & 0x80) != 0;  // true if the bit is on

Second bit: ( binary mask | 0 1 0 0 0 0 0 0 | = 64 = 0x40 )

boolean secondFlag = (data[4] & 0x40) != 0; // true if the bit is on

To compose the next uint32, you will have to assemble its bytes across the byte boundaries of the underlying data. E.g. for the first byte, take the remaining 6 bits of data[4], shift them two to the left (they become bits 7 to 2 of that byte) and "add" the first (highest) two bits of data[5] by shifting them 6 bits to the right (they fill the remaining slots 1 and 0 of the byte). "Adding" means bitwise OR'ing:

// The cast to byte discards the two flag bits that were shifted past bit 7.
byte uint32Byte1 = (byte) (((data[4] & 0xFF) << 2) | ((data[5] & 0xFF) >> 6));

Building your uint32 is then the same procedure as in the first example. And so on and so forth.
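Doing this by hand for every field gets tedious, so the same masking and shifting can be wrapped in a small reader that hands out one bit at a time. The following is only a sketch: the class name BitReader and its methods are hypothetical, bits are assumed to be numbered from the most significant bit of data[0], and ISO-8859-1 is an assumed encoding for the string field.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads fields of arbitrary bit width, MSB-first, from a byte[].
final class BitReader {
    private final byte[] data;
    private int bitPos; // index of the next unread bit; 0 is the top bit of data[0]

    BitReader(byte[] data) {
        this.data = data;
    }

    // Returns the next 'width' bits (1..63) as a non-negative long.
    long read(int width) {
        if (width < 1 || width > 63) {
            throw new IllegalArgumentException("width must be 1..63");
        }
        if (bitPos + width > data.length * 8) {
            throw new IllegalStateException("not enough data");
        }
        long value = 0;
        for (int i = 0; i < width; i++, bitPos++) {
            int bit = (data[bitPos >>> 3] >> (7 - (bitPos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    // Reads 'length' characters of 8 bits each; the charset is an assumption.
    String readString(int length) {
        byte[] raw = new byte[length];
        for (int i = 0; i < length; i++) {
            raw[i] = (byte) read(8);
        }
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decodes the layout from the question: uint32, flag, flag, uint32, 4-bit enum, 32-byte string.
    public static void main(String[] args) {
        byte[] message = new byte[41]; // 326 bits of fields, rounded up to whole bytes
        message[3] = 7;                // first uint32 = 7
        message[4] = (byte) 0x80;      // first flag on, second flag off
        BitReader in = new BitReader(message);
        long first = in.read(32);
        boolean flagA = in.read(1) == 1;
        boolean flagB = in.read(1) == 1;
        long second = in.read(32);
        int enumValue = (int) in.read(4);
        String text = in.readString(32);
        System.out.println(first + " " + flagA + " " + flagB + " " + second
                + " " + enumValue + " " + text.length());
    }
}
```

Reading bit by bit is slower than shifting whole bytes, but it keeps the field order in the parsing code identical to the order in the spec, which makes mistakes easier to spot.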


With Java Binary Block Parser (JBBP), the code to parse the message would be:

  class Parsed {
    @Bin int field1;
    @Bin (type = BinType.BIT) boolean field2;
    @Bin(type = BinType.BIT) boolean field3;
    @Bin int field4;
    @Bin(type = BinType.BIT) int enums;
    @Bin(type = BinType.UBYTE_ARRAY) String str;
  }

  Parsed parsed = JBBPParser.prepare("int field1; bit field2; bit field3; int field4; bit:4 enums; ubyte [32] str;").parse(STREAM).mapTo(Parsed.class);


I've heard nice things about Preon.


Just to add to pholser's answer, I think the Preon version would be something like this:

class DataStructure {
  @BoundNumber(size="32")  long       first; // uint32
  @Bound                   boolean    second; // boolean
  @Bound                   boolean    third; // boolean
  @BoundNumber(size="32")  long       fourth; // uint32
  @BoundNumber(size="4")   int        fifth; // enum
  @BoundString(size="32")  String     sixth; // string
}

... but in reality, you can make your life even easier by using Preon's support for dealing with enumerations directly.

Creating a Codec for it and using it to decode some data would be something like this:

Codec<DataStructure> codec = Codecs.create(DataStructure.class);
DataStructure data = Codecs.decode(codec, ....);


You need to apply bitwise operations (AND, OR, AND NOT) to read or change single bits within a byte in Java. The corresponding operators are &, | and ~ (AND NOT is written as & ~).
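A minimal sketch of those three operations on a single byte follows. The class and method names are hypothetical, and bit 0 is taken to be the least significant bit.

```java
// Hypothetical helper showing test / set / clear of one bit in a byte.
final class BitFlags {
    // True if bit 'bit' (0 = least significant, 7 = most significant) is 1.
    static boolean isSet(byte b, int bit) {
        return (b & (1 << bit)) != 0;
    }

    // Returns 'b' with the given bit switched on (bitwise OR).
    static byte set(byte b, int bit) {
        return (byte) (b | (1 << bit));
    }

    // Returns 'b' with the given bit switched off (AND with the inverted mask).
    static byte clear(byte b, int bit) {
        return (byte) (b & ~(1 << bit));
    }

    public static void main(String[] args) {
        byte flags = 0;
        flags = set(flags, 7);                // turn on the top bit
        System.out.println(isSet(flags, 7));
        flags = clear(flags, 7);              // and turn it off again
        System.out.println(isSet(flags, 7));
    }
}
```

The casts back to byte are needed because Java promotes byte operands to int before applying &, | and ~.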
