
How does the C# compiler work with a split?

Developer https://www.devze.com 2023-03-30 18:53 Source: Internet
I have a List<string> that I am iterating through, splitting each item and then appending part of it to a StringBuilder.


foreach(string part in List)
{
   StringBuilder.Append(part.Split(':')[1] + " ");
}

So my question is: how many strings are created by doing this split? Every split is going to produce two items, so I was thinking that it will create a string[2] and then an empty string. But does it then create the concatenation of string[1] + " " before adding it to the StringBuilder, or is this optimized?


The code is actually equivalent to this:

foreach(string part in myList)
{
   sb.Append(string.Concat(part.Split(':')[1], " "));
}

So yes, an additional string, representing the concatenation of the second part of the split and the single-space string, will be created.

Including the original string, you also have the two strings created by the call to Split() (plus the string[] array that holds them), and a reference to the literal string " ", which will be loaded from the assembly metadata.

You can save yourself the call to Concat() by appending the split result and the space sequentially:

sb.Append(part.Split(':')[1]).Append(" ");

Note that if you are only concatenating string literals, the compiler will make one optimization for you:

sb.Append("This is " + "one string");

is actually compiled to

sb.Append("This is one string");
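One way to observe the literal folding described above is a reference comparison. This is only a minimal sketch: the class and method names are made up for illustration, and it relies on the runtime handing back the same object for identical literals in one assembly, which is what happens in practice.

```csharp
using System;

static class ConstantFoldingDemo
{
    // Both operands are compile-time constants, so the compiler emits one literal.
    public static bool LiteralsAreFolded()
    {
        string folded = "This is " + "one string";
        string literal = "This is one string";
        // Identical literals in the same assembly resolve to the same object.
        return object.ReferenceEquals(folded, literal);
    }

    // One operand is only known at run time, so string.Concat is called
    // and a new string object is allocated.
    public static bool RuntimeConcatIsNewObject(string prefix)
    {
        string joined = prefix + "one string";
        string literal = "This is one string";
        return !object.ReferenceEquals(joined, literal);
    }

    static void Main()
    {
        Console.WriteLine(LiteralsAreFolded());
        Console.WriteLine(RuntimeConcatIsNewObject("This is "));
    }
}
```

The second method is the situation in the question: part.Split(':')[1] is a run-time value, so the compiler cannot fold it with " ".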


3 extra strings for every item:

  • part.Split(':')[0]
  • part.Split(':')[1]
  • part.Split(':')[1] + " "

The fewest allocations would come from avoiding the temporary strings completely, though the usual micro-optimization caveats apply.

var start = part.IndexOf(':') + 1;
stringbuilder.Append(part, start, part.Length-start).Append(' ');
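Wrapped in a complete program, the IndexOf approach could look roughly like this. The class name, method name and sample data are assumptions made for the sake of a runnable example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class IndexOfDemo
{
    // Appends the text after the first ':' of each item, followed by a space.
    public static string JoinAfterColon(IEnumerable<string> items)
    {
        var sb = new StringBuilder();
        foreach (string part in items)
        {
            // IndexOf returns -1 when there is no ':', so start becomes 0
            // and the whole item is appended.
            int start = part.IndexOf(':') + 1;
            // Copies the characters straight from 'part' into the builder;
            // no intermediate substring is created.
            sb.Append(part, start, part.Length - start).Append(' ');
        }
        return sb.ToString();
    }

    static void Main()
    {
        var items = new List<string> { "a:one", "b:two" };
        Console.WriteLine(JoinAfterColon(items)); // prints "one two " (with a trailing space)
    }
}
```

Note that this is not an exact replacement for Split(':')[1]: with more than one ':' it appends everything after the first one, and with no ':' it appends the whole item where Split(':')[1] would throw.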


You have the original string - 1 string

You have the original split into two - 2 strings

You have the second part of the split joined with " " - 1 string

The string builder does not create a new string.

The current code uses 4 strings, including the original.

If you want to save one string do:

StringBuilder.Append(part.Split(':')[1]);
StringBuilder.Append(" ");


This code:

foreach(string part in List)
{
   StringBuilder.Append(part.Split(':')[1] + " ");
}

Is equivalent to:

foreach(string part in List)
{
   string tmp = string.Concat(part.Split(':')[1], " ");
   StringBuilder.Append(tmp);
}

So yes, it's creating a string needlessly. This would be better, at least in terms of the number of strings generated:

foreach(string part in List)
{
   StringBuilder.Append(part.Split(':')[1])
                .Append(" ");
}


So for the n values in the list (each one is part in your code) you are allocating:

  1. x strings per value for the split (I assume x = 2), so nx in total.
  2. n strings for the concatenation.
  3. Roughly n + 1 strings for the StringBuilder; probably far fewer though.

So you have nx + n + n + 1 at the end, which is 4n + 1 assuming the split always results in two values.

One way to improve this would be:

foreach(string part in List) 
{
    var val = part.Split(':')[1];
    StringBuilder.EnsureCapacity(StringBuilder.Length + val.Length + 1);
    StringBuilder.Append(val);
    StringBuilder.Append(' ');
}

This makes it 3n + 1. It is a rough estimate, because StringBuilder allocates more internal storage whenever it runs out of space; calling EnsureCapacity makes sure the space is there before the append.
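A variation on the same idea is to size the builder once, before the loop, instead of calling EnsureCapacity on every iteration. The following is a sketch under the assumption that the list is available up front; the names are hypothetical and the capacity is a deliberate overestimate.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PresizeDemo
{
    public static string JoinSecondParts(IReadOnlyList<string> items)
    {
        // Upper bound: every character of every item plus one space per item.
        int capacity = 0;
        foreach (string item in items)
        {
            capacity += item.Length + 1;
        }

        // With the capacity fixed up front, the builder should not need to grow.
        var sb = new StringBuilder(capacity);
        foreach (string item in items)
        {
            // Split still allocates its array and two strings per item.
            sb.Append(item.Split(':')[1]).Append(' ');
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinSecondParts(new List<string> { "a:one", "b:two" }));
    }
}
```

This only removes the builder's own regrowth from the count; the per-item allocations made by Split remain.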


Probably the only way to be sure how this is compiled is to build it and then decompile it with Reflector to see how it is handled internally. Either way, keep in mind that it probably has no measurable impact on the performance of the app as a whole.
