
Better ways of improving code serialization speed

开发者 https://www.devze.com 2022-12-09 11:53 Source: Web
I have the following code that serializes a List to a byte array for transport via Web Services. The code works relatively fast on smaller entities, but this is a list of 60,000 or so items. It takes several seconds to execute the formatter.Serialize method. Any way to speed this up?

public static byte[] ToBinary(Object objToBinary)
{
    using (MemoryStream memStream = new MemoryStream())
    {
        BinaryFormatter formatter = new BinaryFormatter(null, new StreamingContext(StreamingContextStates.Clone));
        formatter.Serialize(memStream, objToBinary);
        memStream.Seek(0, SeekOrigin.Begin);
        return memStream.ToArray();
    }
}


The inefficiency you're experiencing comes from several sources:

  1. The default serialization routine uses reflection to enumerate object fields and get their values.
  2. The binary serialization format stores things in associative lists keyed by the string names of the fields.
  3. You've got a spurious ToArray in there (as Danny mentioned).

You can get a pretty big improvement off the bat by implementing ISerializable on the object type that is contained in your List. That will cut out the default serialization behavior that uses reflection.

You can get a little more speed if you cut down the number of elements in the associative array that holds the serialized data. Make sure the elements you do store in that associative array are primitive types.
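The two points above can be combined in one sketch. The `Item` type, its members, and the short key names below are hypothetical, chosen only to illustrate implementing ISerializable with a small number of primitive-typed entries:

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class Item : ISerializable
{
    public int Id { get; set; }
    public string Name { get; set; }

    public Item() { }

    // Deserialization constructor: reads back the same keys written below.
    protected Item(SerializationInfo info, StreamingContext context)
    {
        Id = info.GetInt32("i");
        Name = info.GetString("n");
    }

    // Custom serialization: store only primitive values under short key
    // names, bypassing the reflection-based default behavior.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("i", Id);
        info.AddValue("n", Name);
    }
}
```

Short keys like "i" and "n" also shrink the associative list the formatter writes for every object, which adds up over 60,000 items.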

Finally, you can eliminate the ToArray but I doubt you'll even notice the bump that gives you.


If you want real serialization speed, consider using protobuf-net, the C# implementation of Google's Protocol Buffers. It is supposed to be an order of magnitude faster than BinaryFormatter.
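A minimal sketch of what that looks like, assuming the protobuf-net NuGet package is installed; the `Item` type and its members are hypothetical stand-ins for the list element type:

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net NuGet package

[ProtoContract]
public class Item
{
    [ProtoMember(1)]
    public int Id { get; set; }

    [ProtoMember(2)]
    public string Name { get; set; }
}

public static class ProtoHelper
{
    // Serialize the whole list in one call. protobuf-net emits compiled
    // serializers rather than reflecting over each object, and its wire
    // format is far more compact than BinaryFormatter's.
    public static byte[] ToBinary<T>(List<T> items)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, items);
            return ms.ToArray();
        }
    }
}
```

Note that the numeric tags in `[ProtoMember(n)]` define the wire format, so they must stay stable across versions of the type.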


It would probably be much faster to serialize the entire array (or collection) of 60,000 items in one shot, into a single large byte[] array, instead of in separate chunks. Is having each of the individual objects represented by its own byte[] array a requirement of other parts of the system you're working with? Also, are the actual Types of the objects known? If you used a specific Type (perhaps a common base class of all 60,000 objects), the framework would not have to do as much casting and searching for your prebuilt serialization assemblies. Right now you're only giving it Object.


.ToArray() creates a new array; it may be more efficient to copy the data into an existing array using unsafe methods (such as accessing the stream's memory with fixed, then copying it with memcpy via DllImport).

Also consider using a faster custom formatter.
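As a managed alternative to the unsafe copy described above: MemoryStream.TryGetBuffer (available since .NET 4.6) exposes the stream's internal buffer without any copy at all, as long as the stream was created with an exposable buffer. A sketch:

```csharp
using System;
using System.IO;

public static class StreamHelper
{
    // Returns the stream's internal buffer as an ArraySegment when possible,
    // avoiding the extra allocation and copy that ToArray() performs. The
    // segment can be written to a socket or file directly.
    public static ArraySegment<byte> GetSegment(MemoryStream ms)
    {
        ArraySegment<byte> buffer;
        if (ms.TryGetBuffer(out buffer))
            return buffer;

        // Fall back to a copy if the buffer is not exposable
        // (e.g. the stream was constructed over an existing byte[]).
        return new ArraySegment<byte>(ms.ToArray());
    }
}
```

Be aware the returned segment aliases the live stream buffer, so it must not be used after the stream is written to again or disposed.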


I started a code-generator project that includes a binary DataContract serializer, which beats Json.NET by a factor of at least 30. All you need is the generator NuGet package and an additional lib that comes with faster replacements for BitConverter.

Then you create a partial class and decorate it with DataContract, and each serializable property with DataMember. The generator then creates a ToBytes method, and together with the additional lib you can serialize collections as well. Here is my example from that post:

var objects = new List<Td>();
for (int i = 0; i < 1000; i++)
{
    var obj = new Td
    {
        Message = "Hello my friend",
        Code = "Some code that can be put here",
        StartDate = DateTime.Now.AddDays(-7),
        EndDate = DateTime.Now.AddDays(2),
        Cts = new List<Ct>(),
        Tes = new List<Te>()
    };
    for (int j = 0; j < 10; j++)
    {
        obj.Cts.Add(new Ct { Foo = i * j });
        obj.Tes.Add(new Te { Bar = i + j });
    }
    objects.Add(obj);
}

With this generated ToBytes() method:

public int Size
{
    get 
    { 
        var size = 24;
        // Add size for collections and strings
        size += Cts == null ? 0 : Cts.Count * 4;
        size += Tes == null ? 0 : Tes.Count * 4;
        size += Code == null ? 0 : Code.Length;
        size += Message == null ? 0 : Message.Length;

        return size;              
    }
}

public byte[] ToBytes(byte[] bytes, ref int index)
{
    if (index + Size > bytes.Length)
        throw new ArgumentOutOfRangeException("index", "Object does not fit in array");

    // Convert Cts
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Cts == null ? 0 : Cts.Count), bytes, ref index);
    if (Cts != null)
    {
        for(var i = 0; i < Cts.Count; i++)
        {
            var value = Cts[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Tes
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Tes == null ? 0 : Tes.Count), bytes, ref index);
    if (Tes != null)
    {
        for(var i = 0; i < Tes.Count; i++)
        {
            var value = Tes[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Code
    GeneratorByteConverter.Include(Code, bytes, ref index);
    // Convert Message
    GeneratorByteConverter.Include(Message, bytes, ref index);
    // Convert StartDate
    GeneratorByteConverter.Include(StartDate.ToBinary(), bytes, ref index);
    // Convert EndDate
    GeneratorByteConverter.Include(EndDate.ToBinary(), bytes, ref index);
    return bytes;
}

It serializes each object in ~1.5 microseconds, so 1000 objects take about 1.7 ms.

