开发者

.net Generic List optimization relating to capacity

开发者 https://www.devze.com 2023-03-04 09:52 出处:网络
I\'m currently workin开发者_开发知识库g with an app that will do the following. // Initialize a list:

I'm currently workin开发者_开发知识库g with an app that will do the following.

// Initialize a list: 
myList = new List<aPoint>;
while(WeHaveMoreData)
   myList->Add(ReturnNext1000Points());

I have no way of knowing the total size of the list from the beginning. From what I've read, List<> is the best way to handle this much data incoming (could be upwards of 500k records).

I'm wondering if I should handle the capicity of the list (give it initial values, or increase the cap if it needs it)?

How do I approach optimizing such a procedure?


If you have an approximation of the total records you could set the capacity of the list otherwise leave it to grow. It is pretty optimized, just ensure you don't run out of memory. Another approach is to use a lazy iterator which won't load the entire list in memory:

public IEnumerable<aPoint> GetPoints()
{
    while(WeHaveMoreData)
    {
        yield return new aPoint();
    }
}

It is only once you start iterating that records will begin to be fetched, one by one and released immediately:

foreach (var point in GetPoints())
{
    /// TODO: do something with the point
}


First rule: premature optimization is the root of all evil. If perfomance is not an issue leave it as is. Overwise you should try set initial size of list to about AverageExpectedSize/0.7.


I also think you can't optimize it much.. I guess you could do slightly better in some specific cases, so I have a question - what do you do with the data afterwards? Also - do you want to optimize for memory or speed?

A typical list implementation will grow capacity by a factor of 2x every time, so maybe you could save some space by having a List<aPoint[]>, which would have much fewer elements, so it would be less likely that you have a few 100k of spare capacity. But that would only matter if you were just about to run out of memory - it's likely that much more memory is spent on the data itself in any case..


In general, I would say that if you don't know the number of elements within say +/- 20% then you are probably should just add blindly to the List instead of guessing the capacity.

List is different than an array when it comes to matters of adding when at capacity. Remember that the List will double its capacity once you exceed the capacity. So for example, if your list has a current capacity of 128 elements and you add an element that makes it 129 elements, the list will resize its capacity to 256 elements. Then for the next 128 Adds you don't resize the list at all. Once you get to 257, it will double to 512, and the process repeats itself.

Thus you will have O(log(n)) resizes to your list.

0

精彩评论

暂无评论...
验证码 换一张
取 消