C# merge distinct items of 2 collections

Developer https://www.devze.com 2023-03-05 23:23 Source: Internet
I'm looking for a performant way to add distinct items of a second ICollection to an existing one. I'm using .NET 4.


This should do it:

list1.Union(list2).Distinct(aCustomComparer).ToList()


As long as they're IEnumerable<T>, you can use the go-to LINQ answer:

var union = firstCollection.Union(secondCollection);

This uses the default equality comparer, which for reference types that don't override Equals and GetHashCode is reference equality. To change this, define an IEqualityComparer<T> for the item type in your collection that performs a more semantic comparison, and pass it as the second argument of Union.
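As a rough sketch of the comparer approach described above, assuming a hypothetical Person type that should be matched by name (neither Person, PersonNameComparer, nor UnionExample comes from the original question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type, used only for illustration.
public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Treats two Person objects as equal when their names match, ignoring case.
public sealed class PersonNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Person obj)
    {
        // Must agree with Equals: names that are equal ignoring case hash the same.
        if (obj == null || obj.Name == null) return 0;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

public static class UnionExample
{
    public static List<Person> Merge(IEnumerable<Person> first, IEnumerable<Person> second)
    {
        // The comparer is passed as the second argument of Union.
        return first.Union(second, new PersonNameComparer()).ToList();
    }
}
```

Union keeps the first occurrence of each item, so when the same name appears in both collections, the object from the first collection is the one that survives.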


Another way to add to your existing list would be:

list1.AddRange(list2.Distinct().Except(list1));
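AddRange is only available on List<T>, so if all you have is an ICollection<T>, something along these lines might work instead. This is a minimal sketch: AddDistinct is a hypothetical helper name, not a framework method, and it assumes target and source are different objects.

```csharp
using System.Collections.Generic;

public static class CollectionMergeExtensions
{
    // Hypothetical helper: adds to target every item of source
    // that target does not already contain.
    public static void AddDistinct<T>(
        this ICollection<T> target,
        IEnumerable<T> source,
        IEqualityComparer<T> comparer = null)
    {
        // Index the existing items once so each lookup is O(1) on average.
        // A null comparer makes HashSet fall back to the default comparer.
        var seen = new HashSet<T>(target, comparer);

        foreach (T item in source)
        {
            // HashSet.Add returns false for items already seen, which also
            // filters out duplicates inside source itself.
            if (seen.Add(item))
            {
                target.Add(item);
            }
        }
    }
}
```

Usage would look like `existing.AddDistinct(incoming);`. Items already in the target are left untouched, including any duplicates it held beforehand.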


The most direct answer to your question, since you didn't give much detail on the actual types of ICollection you have as input or need as output, is the one given by KeithS:

var union = firstCollection.Union(secondCollection);

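This returns an IEnumerable of distinct items, and if that is all you need it looks VERY fast. Bear in mind that Union uses deferred execution: the call only builds the query, and the actual merging happens when the result is enumerated. I made a small test app (below) that runs the Union method (MethodA) against a simple HashSet-based way of deduplicating that returns a HashSet<> (MethodB). In that test the Union method DESTROYS the HashSet one: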

MethodA: 1ms

MethodB: 2827ms

However, having to convert that IEnumerable to some other type of collection such as List<> (like the version ADas posted) changes everything, because the conversion forces the query to actually run:

Simply adding .ToList() to MethodA

var union = firstCollection.Union(secondCollection).ToList();

Changes the results:

MethodA: 3656ms

MethodB: 2803ms

So it seems more would need to be known about the specific case you are working with, and any solution you come up with should be tested, since a small code change can have a HUGE impact.
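To see why appending .ToList() moves the numbers so much, it can help to count how many items are actually read from the sources. The sketch below is illustrative only: DeferredExecutionDemo, Counted, and Run are made-up names, and the sample data is arbitrary.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    private static int itemsRead;

    // Passes items through unchanged while counting how many were actually read.
    private static IEnumerable<string> Counted(IEnumerable<string> source)
    {
        foreach (string s in source)
        {
            itemsRead++;
            yield return s;
        }
    }

    public static List<string> Run(out int readAfterUnion, out int readAfterToList)
    {
        var first = new List<string> { "a", "b", "b" };
        var second = new List<string> { "b", "c" };
        itemsRead = 0;

        // Building the query reads nothing from either source.
        IEnumerable<string> query = Counted(first).Union(Counted(second));
        readAfterUnion = itemsRead;

        // ToList enumerates the query, which is when the merging happens.
        List<string> merged = query.ToList();
        readAfterToList = itemsRead;
        return merged;
    }
}
```

With these sample lists, readAfterUnion is 0 and readAfterToList is 5, and the merged list holds a, b, c. A timing that stops before the query is enumerated therefore measures only the cost of creating the query object.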

Below is the test I used to compare these methods. I'm sure it is a stupid way to test, but it seems to work :)

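    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;

    internal static class Program
    {
        private static void Main(string[] args)
        {
            // Fill both collections with the same 1000 random strings, each added
            // twice, so there are duplicates within and across the collections.
            ICollection<string> collectionA = new List<string>();
            ICollection<string> collectionB = new List<string>();
            for (int i = 0; i < 1000; i++)
            {
                string randomString = Path.GetRandomFileName();
                collectionA.Add(randomString);
                collectionA.Add(randomString);
                collectionB.Add(randomString);
                collectionB.Add(randomString);
            }

            // Time 10000 runs of the LINQ Union approach.
            Stopwatch testA = new Stopwatch();
            testA.Start();
            MethodA(collectionA, collectionB);
            testA.Stop();

            // Time 10000 runs of the HashSet approach.
            Stopwatch testB = new Stopwatch();
            testB.Start();
            MethodB(collectionA, collectionB);
            testB.Stop();

            Console.WriteLine("MethodA: {0}ms", testA.ElapsedMilliseconds);
            Console.WriteLine("MethodB: {0}ms", testB.ElapsedMilliseconds);
            Console.ReadLine(); // keep the console window open
        }

        private static void MethodA(ICollection<string> collectionA, ICollection<string> collectionB)
        {
            for (int i = 0; i < 10000; i++)
            {
                // Union is deferred: this line only creates the query object.
                // Nothing is merged until the result is enumerated, for example
                // by appending .ToList() as in the second measurement.
                var result = collectionA.Union(collectionB);
            }
        }

        private static void MethodB(ICollection<string> collectionA, ICollection<string> collectionB)
        {
            for (int i = 0; i < 10000; i++)
            {
                // Eagerly build the distinct set: copy A, then add B's items.
                var result = new HashSet<string>(collectionA);
                foreach (string s in collectionB)
                {
                    result.Add(s);
                }
            }
        }
    }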