开发者

IEnumerable question: Best performance?

开发者 https://www.devze.com 2022-12-11 11:17 出处:网络
Quick question: Which one is faster? foreach (Object obj in Collection) { if(obj.Mandatory){ ... } } or foreach (Object obj i开发者_StackOverflown Collection.FindAll(o => o.Mandatory))

Quick question:

Which one is faster?

foreach (Object obj in Collection)
{
     if(obj.Mandatory){ ... }
}

or

foreach (Object obj i开发者_StackOverflown Collection.FindAll(o => o.Mandatory))
{
...
}

and if you know a faster suggestion, i'd be pleased to know.

Thank you


If your Collection is a List<T> then FindAll is implemented by creating a new List<T> and copying all items that match the predicate. This is obviously slower than just enumerating the collection and deciding for each item if the predicate holds.

If you're using .NET 3.5 you can use LINQ which will not create a copy and and is similar to your first example:

foreach (object obj in someCollection.Where(o => o.Mandatory))
{
    ...
}

Note this isn't necessarily the fastest solution. It's just easy to see that a method that allocates memory and enumerates a collection is slower than a method that only enumerates a collection. If performance is critical: measure it.


The following test code prints the system ticks (1 tick = 100 nanoseconds) for iterating through 10 million objects. The FindAll is slowest and the for loop is fastest as expected.

But the overhead of the iteration is measured in nanoseconds per item even in the worst case. If you're doing anything significant in the loop (e.g. something which takes a microsecond per item), then the speed difference of the iteration is completely insignificant.

So for the love of Turing don't forbid foreach in your coding guidelines now. It doesn't make any practical difference, and the LINQ statements sure are easier to read.

   public class Test
   {
      public bool Bool { get; set; }
   }

   class Program
   {

      static void Main(string[] args)
      {
         // fill test list
         var list = new List<Test>();
         for (int i=0; i<1e7; i++)
         {
            list.Add(new Test() { Bool = (i % 2 == 0) });
         }

         // warm-up
         int counter = 0;
         DateTime start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }

         // List.FindAll
         counter = 0;
         start = DateTime.Now;
         foreach (var test in list.FindAll(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 7969158

         // IEnumerable.Where
         counter = 0;
          start = DateTime.Now;
         foreach (var test in list.Where(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 5156514

         // for loop
         counter = 0;
         start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 2968902


      }


The first will be somewhat faster.

In the second case, you're using List<T>.FindAll to create a temporary list that matches your criteria. This copies the list, then iterates over it.

However, you could accomplish the same thing, with the same speed as your first option, by doing:

foreach (Object obj in Collection.Where(o => o.Mandatory))
{
}

This is because Enumerable.Where uses streaming to return an IEnumerable<T>, which is generated as you iterate. No copy is made.


The fastest you could ever get without parallelizing the enumeration into multiple threads taking accounts of number of processors, etc:

for (int i = 0; i < Collection.Count; i++)
{
    var item = Collection[i];
    if (item.Mandatory) { ... }
}

I would recommend you though to always use Linq instead of writing for or foreach loops because in the future it will become so intelligent that it will actually be capable of distributing the work over processors and take into account hardware specific things (see PLinq) and it will eventually be faster than if you wrote the loops yourself: declarative vs imperative programing.


FindAll is just syntactic sugar. For example:

    List<string> myStrings = new List<string>();
    foreach (string str in myStrings.FindAll(o => o.Length > 0))
    {

    }

Compiles to:

List<string> list = new List<string>();
if (CS$<>9__CachedAnonymousMethodDelegate1 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate1 = new Predicate<string>(MyClass.<RunSnippet>b__0);
}
using (List<string>.Enumerator enumerator = list.FindAll(CS$<>9__CachedAnonymousMethodDelegate1).GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        string current = enumerator.Current;
    }
}

public List<T> FindAll(Predicate<T> match)
{
    if (match == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    List<T> list = new List<T>();
    for (int i = 0; i < this._size; i++)
    {
        if (match(this._items[i]))
        {
            list.Add(this._items[i]);
        }
    }
    return list;
}

private static bool <RunSnippet>b__0(string o)
{
    return (o.Length > 0);
}


If performance is in question this probably is not the bottleneck, however, have you considered using the parallel library or PLINQ? see below:

Parallel.ForEach(Collection, obj =>
{
    if (obj.Mandatory)
    {
        DoWork();
    }
});

http://msdn.microsoft.com/en-us/library/dd460688(v=vs.110).aspx

Also, although perhaps slightly unrelated it seems as though performance peeks your curiousity, if you are dealing with very large sets of data, a binary search may be useful. In my case I have two separate lists of data. I have to deal with lists of millions of records and this saved me literally an exponential amount of time per execution. The only downside is that it is ONLY useful for very large collections and is required to be sorted beforehand. You will also notice that this makes use of the ConcurrentDictionary class, which provides significant overhead, but it is thread safe and was required due to the requirements and the number of threads I am managing asynchronously.

private ConcurrentDictionary<string, string> items;
private List<string> HashedListSource { get; set; }
private List<string> HashedListTarget { get; set; }

this.HashedListTarget.Sort();
this.items.OrderBy(x => x.Value);

private void SetDifferences()
{
    for (int i = 0; i < this.HashedListSource.Count; i++)
    {
        if (this.HashedListTarget.BinarySearch(this.HashedListSource[i]) < 0)
        {
            this.Mismatch.Add(items.ElementAt(i).Key);
        }
    }
}

IEnumerable question: Best performance?

This image was originally posted in a great article found here: http://letsalgorithm.blogspot.com/2012/02/intersecting-two-sorted-integer-arrays.html

Hope this helps!

0

精彩评论

暂无评论...
验证码 换一张
取 消