I have a custom binary file which I want to read into my C# program.
There are several different formats, some MSB first, some LSB first and some with the variables in different orders.
Currently, I have a class which reads the right number of bytes, one at a time.
It is very slow and so I am looking to improve performance any way I can.
Is serialization likely to perform better? If so, is this possible with the scenario I have described? Is it possible to customise the BinaryFormatter for big/little-endian format?
Thanks.
You can't do that with BinaryFormatter - it will expect additional metadata/padding around the object. You would have to read manually, either from a Stream or similarly via a binary reader.
Having done some very similar code, I would write my own reader that sits on top of a stream, with methods like ReadInt32LittleEndian, ReadInt32BigEndian (etc. for everything you need), and use shifts (<< / >>) to assemble the bytes. But importantly, I would use a backing buffer to reduce the number of calls to the underlying stream (even with a buffer, this can be unacceptably slow).
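To make the shift-based assembly concrete, here is a minimal sketch that reads a 32-bit integer from a byte array in either byte order. The class name EndianBytes and its method signatures are hypothetical, chosen only for illustration; they are not taken from any library.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class EndianBytes
{
    // Least significant byte first: buffer[offset] holds the low 8 bits.
    public static int ReadInt32LittleEndian(byte[] buffer, int offset)
    {
        return buffer[offset]
             | (buffer[offset + 1] << 8)
             | (buffer[offset + 2] << 16)
             | (buffer[offset + 3] << 24);
    }

    // Most significant byte first: buffer[offset] holds the high 8 bits.
    public static int ReadInt32BigEndian(byte[] buffer, int offset)
    {
        return (buffer[offset] << 24)
             | (buffer[offset + 1] << 16)
             | (buffer[offset + 2] << 8)
             | buffer[offset + 3];
    }
}
```

For the bytes 01 02 03 04, the little-endian method returns 0x04030201 and the big-endian method returns 0x01020304. On newer .NET versions, System.Buffers.Binary.BinaryPrimitives offers equivalent span-based methods, which may be worth using instead of hand-written shifts.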
Let me refer you to some code from protobuf-net that does this, in particular ProtoReader, taking an example:
    /// <summary>
    /// Reads an unsigned 32-bit integer from the stream; supported wire-types: Variant, Fixed32, Fixed64
    /// </summary>
    public uint ReadUInt32()
    {
        switch (wireType)
        {
            case WireType.Variant:
                return ReadUInt32Variant(false);
            case WireType.Fixed32:
                if (available < 4) Ensure(4, true);
                position += 4;
                available -= 4;
                return ((uint)ioBuffer[ioIndex++])
                    | (((uint)ioBuffer[ioIndex++]) << 8)
                    | (((uint)ioBuffer[ioIndex++]) << 16)
                    | (((uint)ioBuffer[ioIndex++]) << 24);
            case WireType.Fixed64:
                ulong val = ReadUInt64();
                checked { return (uint)val; }
            default:
                throw CreateException();
        }
    }
(here wireType broadly acts as an indicator of endianness etc., but that isn't important)
Looking at the Fixed32 implementation:
- The Ensure makes sure that we have at least 4 more bytes in our backing buffer (fetching more if we desire)
- we increment some counters so we can track our position in the logical buffer
- we read the data from the buffer
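The three steps above can be sketched as a small self-contained reader. This is not the protobuf-net implementation: the class name BufferedEndianReader, the one-argument Ensure, and the 4096-byte default buffer size are all assumptions made for illustration. The field names mirror the ones in the snippet above.

```csharp
using System;
using System.IO;

// Hypothetical sketch of a buffered reader; not the protobuf-net code.
public sealed class BufferedEndianReader
{
    private readonly Stream source;
    private readonly byte[] ioBuffer;
    private int ioIndex;    // index of the next unread byte in ioBuffer
    private int available;  // number of unread bytes left in ioBuffer
    private long position;  // logical position: bytes consumed so far

    public BufferedEndianReader(Stream source, int bufferSize = 4096)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (bufferSize < 8) throw new ArgumentOutOfRangeException(nameof(bufferSize));
        this.source = source;
        ioBuffer = new byte[bufferSize];
    }

    public long Position { get { return position; } }

    // Step 1: make sure at least 'count' unread bytes are in the buffer.
    private void Ensure(int count)
    {
        if (available >= count) return;
        // Move any unread bytes to the front so there is room to refill.
        if (available > 0 && ioIndex > 0)
            Buffer.BlockCopy(ioBuffer, ioIndex, ioBuffer, 0, available);
        ioIndex = 0;
        while (available < count)
        {
            int read = source.Read(ioBuffer, available, ioBuffer.Length - available);
            if (read <= 0) throw new EndOfStreamException();
            available += read;
        }
    }

    public uint ReadUInt32LittleEndian()
    {
        Ensure(4);
        position += 4;   // Step 2: update the counters
        available -= 4;
        // Step 3: assemble the value, lowest byte first
        return ((uint)ioBuffer[ioIndex++])
             | (((uint)ioBuffer[ioIndex++]) << 8)
             | (((uint)ioBuffer[ioIndex++]) << 16)
             | (((uint)ioBuffer[ioIndex++]) << 24);
    }

    public uint ReadUInt32BigEndian()
    {
        Ensure(4);
        position += 4;
        available -= 4;
        // Same bytes, highest byte first
        return (((uint)ioBuffer[ioIndex++]) << 24)
             | (((uint)ioBuffer[ioIndex++]) << 16)
             | (((uint)ioBuffer[ioIndex++]) << 8)
             | ((uint)ioBuffer[ioIndex++]);
    }
}
```

The point of the design is that the underlying stream is touched only when the buffer runs dry, so most reads are a few array accesses and shifts. A format-specific class can then call whichever method matches the byte order of each field.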
Once you have a reader for your format, deserialization should be much easier.
No, it won't work. Well, it could, but the overhead from a transformation will likely kill performance.