How to rename duplicates in a list using LINQ

Developer https://www.devze.com 2023-03-30 07:09 Source: Internet
I need a list of unique values, but a value may occur twice or more in the list. When that happens I must rename the value, and the renamed value could also already be in the list.

Is it possible to rename the values using a LINQ query, so I don't need a sub-query?

example 1: before: "one", "one", "two", "two", "three" after: "one", "one_", "two", "two_", "three"

example 2: before: "one", "one", "one_" after: "one", "one_", "one__"

The 3rd "one" has 2 underscores because the 2nd "one" was renamed to "one_".

Thanks a lot for an idea...


I don't think this should be done with just a LINQ query. If I were you, I'd use a HashSet and write a function. Something like this:

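IEnumerable<String> GetUnique(IEnumerable<String> list) {
    // Values that have already been returned.
    HashSet<String> itms = new HashSet<String>();
    foreach(string itm in list) {
        // Work on a copy: the foreach variable itself cannot be reassigned.
        string itr = itm;
        // Append "_" until the value is not in the set yet.
        while(itms.Contains(itr)) {
            itr = itr + "_";
        }
        itms.Add(itr);
        yield return itr;
    }
}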

[Edit]

This could also be made into an extension method, so you could call it as myList.GetUnique() (or something like that).

[Edit 2]

Fixed a bug where the iteration variable was being modified.
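
For comparison, here is a rough sketch of what a LINQ-only version could look like. It is not from the answer above: the class name LinqOnlySketch and the method name RenameDuplicates are made up for illustration. It works only because the lambda mutates a captured HashSet, which is the kind of side effect that makes a plain function the cleaner choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqOnlySketch
{
    // Hypothetical helper, not part of the original answer.
    public static List<string> RenameDuplicates(IEnumerable<string> source)
    {
        var seen = new HashSet<string>();

        // The lambda mutates 'seen', so the query is materialized right away
        // with ToList(); enumerating a deferred query a second time would
        // run against the already filled set.
        return source
            .Select(value =>
            {
                var candidate = value;
                // HashSet<T>.Add returns false when the value is already present.
                while (!seen.Add(candidate))
                {
                    candidate += "_";
                }
                return candidate;
            })
            .ToList();
    }
}
```

Called with "one", "one", "one_" this should give "one", "one_", "one__", the same as the question's second example.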


I would create a new extension method like this:

public static IEnumerable<string> Uniquifier(this IEnumerable<string> values)
{
    if (values == null) throw new ArgumentNullException("values");

    var unique = new HashSet<string>();

    foreach(var item in values)
    {
        var newItem = item;

        while(unique.Contains(newItem))
        {
            newItem += '_';
        }

        unique.Add(newItem);

        yield return newItem;
    }
}

This takes any sequence of strings and keeps the values it has already returned in a HashSet, whose lookups are very fast (O(1) on average). If a value already exists, it appends a '_' and tries again. Once it has a unique value, it returns it.
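
One thing to be aware of: because the method contains yield return, its body (including the null check) does not run until the result is enumerated. If the exception should be thrown at the call site, a common pattern is to split the method in two. The sketch below assumes that is wanted; the class name UniquifierSketch and the helper UniquifierIterator are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class UniquifierSketch
{
    public static IEnumerable<string> Uniquifier(this IEnumerable<string> values)
    {
        // This method has no yield, so the check runs as soon as it is called.
        if (values == null) throw new ArgumentNullException(nameof(values));
        return UniquifierIterator(values);
    }

    // Hypothetical private helper that holds the original loop unchanged.
    private static IEnumerable<string> UniquifierIterator(IEnumerable<string> values)
    {
        var unique = new HashSet<string>();

        foreach (var item in values)
        {
            var newItem = item;

            while (unique.Contains(newItem))
            {
                newItem += '_';
            }

            unique.Add(newItem);

            yield return newItem;
        }
    }
}
```

The renaming behaviour is the same as before; only the timing of the argument check changes.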


Using an extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<string> Uniquify(this IEnumerable<string> enumerable, string suffix)
    {
        HashSet<string> prevItems = new HashSet<string>();
        foreach(var item in enumerable)
        {
            var temp = item;
            while(prevItems.Contains(temp))
            {
                temp += suffix;
            }
            prevItems.Add(temp);
            yield return temp;
        }
    }
}

usage:

var test1 = new[] {"one","one","two","two","three"};
Console.WriteLine(String.Join(",",test1.Uniquify("_")));

Live example: http://rextester.com/rundotnet?code=BYFVK87508

Edit: Using the while loop now handles the cases that were previously unsupported, as per the comments below.
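
To check the trickier case from the question (example 2, where the renamed value is itself already in the list), a small self-contained program along these lines could be used. The Program class and the test2 variable are additions for illustration; the Uniquify method is copied from the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Same method as in the answer, repeated so the sample compiles on its own.
    public static IEnumerable<string> Uniquify(this IEnumerable<string> enumerable, string suffix)
    {
        HashSet<string> prevItems = new HashSet<string>();
        foreach (var item in enumerable)
        {
            var temp = item;
            while (prevItems.Contains(temp))
            {
                temp += suffix;
            }
            prevItems.Add(temp);
            yield return temp;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Example 2 from the question.
        var test2 = new[] { "one", "one", "one_" };
        Console.WriteLine(string.Join(",", test2.Uniquify("_"))); // prints one,one_,one__
    }
}
```

Note that the suffix should not be empty: with "" as the suffix, temp never changes, so the while loop would not terminate on the first duplicate.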
