I have a very, very large unsorted string array and i need to check 开发者_如何学运维to see if there are duplicates.
What is the most efficient method of checking this?
The simplest way is probably:
if (strings.Length != strings.Distinct().Count())
{
// There are duplicates
}
That will be O(n) - but it won't tell you which items were duplicated.
Alternatively:
HashSet<string> values = new HashSet<string>();
foreach (string x in strings)
{
if (!values.Add(x))
{
// x was a duplicate
}
}
Again, this should be amortized O(n).
Note that you can specify a different IEqualityComparer<string>
if you want a case-insensitive comparison, or something like that.
Loop through the list, and put each element in a sorted tree. This way, you can detect early whether there is a duplicate.
精彩评论