Generating parsers for binary structured data; metaprogramming or external scripts?

Developer https://www.devze.com 2023-02-21 20:34 Source: Internet

I'm writing a server that interfaces with a proprietary protocol. Currently most of the code consists of packet handlers that parse every field of a packet, checking before each field that enough data remains in the buffer. In addition, the packet handlers perform validity checks on the received data (e.g. a value must be in a certain range, or be one of a set of predefined values).
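For illustration, a hand-written handler of the kind described above might look like the following sketch. The packet layout (`LoginHeader`, a one-byte type in the range 1..3 followed by a little-endian 16-bit length of at most 512) and the `Reader` helper are hypothetical and are not taken from the actual protocol; they only show how the size checks and the validity checks interleave for every single field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical packet: u8 type (must be 1..3), u16 little-endian length (<= 512).
struct LoginHeader {
    std::uint8_t  type = 0;
    std::uint16_t length = 0;
};

// Minimal bounds-checked cursor over a received buffer (illustrative helper).
class Reader {
public:
    Reader(const std::uint8_t* data, std::size_t size) : p_(data), left_(size) {}

    bool read_u8(std::uint8_t& out) {
        if (left_ < 1) return false;           // size check
        out = p_[0];
        advance(1);
        return true;
    }

    bool read_u16_le(std::uint16_t& out) {
        if (left_ < 2) return false;           // size check
        out = static_cast<std::uint16_t>(p_[0] | (p_[1] << 8));
        advance(2);
        return true;
    }

private:
    void advance(std::size_t n) {
        assert(n <= left_);
        p_ += n;
        left_ -= n;
    }

    const std::uint8_t* p_;
    std::size_t left_;
};

// Every field needs a size check plus its own validity check:
// this is the boilerplate that a generated parser would remove.
bool parse_login_header(const std::uint8_t* data, std::size_t size, LoginHeader& out) {
    Reader r(data, size);
    LoginHeader h;
    if (!r.read_u8(h.type)) return false;
    if (h.type < 1 || h.type > 3) return false;    // range check
    if (!r.read_u16_le(h.length)) return false;
    if (h.length > 512) return false;              // range check
    out = h;
    return true;
}
```

Only the last line of such a function is specific to what the server does with the packet; everything before it is mechanical.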

This amounts to a lot of boilerplate once it is mixed with the actual packet-handling logic, so I would like to generate the parsers automatically and invoke the handlers on fully parsed structures.

Right now I see two approaches that I could take:

1. Come up with some metaprogramming framework that allows me to describe packet structures, and optionally rules for data validation, so that I can generate the parsing code at compile time. I guess this would be similar in intent to what Boost.Spirit does.
2. Write my own data description language and an external tool that will generate C++ code from it. This doesn't seem too hard, but it would certainly clutter up the build process, and I generally dislike using large amounts of tool-generated code. It also wouldn't permit quickly changing data descriptions inside the source code itself.

The metaprogramming way seems superior in theory, but I haven't thought out a flawless way of implementing this yet. Preferably declaring packets would be similar to declaring a class and would not be full of macros. There's also a problem in cases where I have to refer to previous data members (which is the case for fields repeated a variable number of times, where the count is specified earlier in the packet).
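As a rough sketch of what a macro-free, class-like declaration could look like, the following uses a small template-based `Parser` with chained calls. All names here (`Parser`, `field`, `in_range`, `repeated`, `ItemList`) are hypothetical, and the little-endian layout is an assumption. The earlier-member problem is handled by passing the count by reference, so it is read only after the preceding `field` call has filled it in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Illustrative parser: sticky failure flag, so calls can be chained safely.
class Parser {
public:
    Parser(const std::uint8_t* data, std::size_t size)
        : p_(data), left_(size), ok_(true) {}

    // Fixed-width unsigned integer, assumed little-endian on the wire.
    template <typename T>
    Parser& field(T& out) {
        static_assert(std::is_unsigned<T>::value, "unsigned integer fields only");
        if (!ok_ || left_ < sizeof(T)) { ok_ = false; return *this; }
        T v = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i)
            v = static_cast<T>(v | (static_cast<T>(p_[i]) << (8 * i)));
        out = v;
        p_ += sizeof(T);
        left_ -= sizeof(T);
        return *this;
    }

    // Range check on a field parsed earlier; taken by reference so the value
    // is read when this call runs, not when the chain's arguments are evaluated.
    template <typename T>
    Parser& in_range(const T& v, std::uint64_t lo, std::uint64_t hi) {
        if (ok_ && (v < lo || v > hi)) ok_ = false;
        return *this;
    }

    // Repeated field whose element count was parsed earlier in the packet.
    template <typename T, typename Count>
    Parser& repeated(std::vector<T>& out, const Count& count) {
        out.clear();
        if (!ok_) return *this;
        const std::size_t n = count;
        for (std::size_t i = 0; ok_ && i < n; ++i) {
            T v = 0;
            field(v);
            if (ok_) out.push_back(v);
        }
        return *this;
    }

    bool ok() const { return ok_; }

private:
    const std::uint8_t* p_;
    std::size_t left_;
    bool ok_;
};

// Hypothetical packet, declared like an ordinary struct.
struct ItemList {
    std::uint8_t kind = 0;             // must be 1..3
    std::uint8_t count = 0;            // number of ids that follow, at most 4
    std::vector<std::uint16_t> ids;

    // Field order and validation rules are stated once, here.
    bool parse(Parser& p) {
        return p.field(kind).in_range(kind, 1, 3)
                .field(count).in_range(count, 0, 4)
                .repeated(ids, count)
                .ok();
    }
};
```

This is closer to an embedded description than to true compile-time generation, but it keeps the layout next to the struct and leaves the handler with a fully parsed object.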

Does anyone have experience with similar frameworks, and what would you suggest?

I know about Google Protocol Buffers, but that is intrusive in that it requires being in control of the protocol, which I am not, since the protocol is proprietary.

I've gone the route of creating my own language and tooling for binary structured data multiple times in the past, but that was in part driven by the need to support multiple target languages from the data definitions (at the time, C# and C++); I also created a third target to produce HTML reference documentation from the definitions.

The main advantage I can see in using C++ template metaprogramming is that you can directly interact with the compile-time type system if and when that is useful. For typical binary structured data, though, I've never found it to be all that useful. For example, you'd need a way to process the relevant members in a specific order; Boost serialization does that by requiring a serialization method that specifies which members are processed and in what order.
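To make the last point concrete, here is a minimal sketch that imitates the Boost.Serialization convention of a `serialize(Archive&, version)` member using `ar & member`. It does not use Boost itself: the `ByteReader` and `ByteWriter` archives and the `Position` struct are hypothetical stand-ins that handle single bytes only. The point is that the one `serialize` method fixes both the set of members and their order, for reading and writing alike.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical input archive: reads single bytes and tracks failure.
// It keeps a reference to the buffer, so the buffer must outlive it.
class ByteReader {
public:
    explicit ByteReader(const std::vector<std::uint8_t>& buf)
        : buf_(buf), pos_(0), ok_(true) {}

    ByteReader& operator&(std::uint8_t& v) {
        assert(pos_ <= buf_.size());
        if (ok_ && pos_ < buf_.size()) v = buf_[pos_++];
        else ok_ = false;
        return *this;
    }

    bool ok() const { return ok_; }

private:
    const std::vector<std::uint8_t>& buf_;
    std::size_t pos_;
    bool ok_;
};

// Hypothetical output archive: appends single bytes.
class ByteWriter {
public:
    ByteWriter& operator&(std::uint8_t& v) {
        out.push_back(v);
        return *this;
    }

    std::vector<std::uint8_t> out;
};

struct Position {
    std::uint8_t x = 0, y = 0, z = 0;

    // One method names the members and the order in which they are processed.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & x & y & z;
    }
};
```

Calling `serialize` with a `ByteReader` fills the members from the buffer, and calling it with a `ByteWriter` emits them in the same order.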