Find character with most occurrences in string?

Developer https://www.devze.com 2023-02-12 04:45 Source: Internet
For example, I have a string:

"abbbbccd"

b has the most occurrences. When using C++, the easiest way to handle this is inserting each character into a map<>. Do I have to do the same thing in C#? Is there an elegant way to do it using LINQ?


input.GroupBy(x => x).OrderByDescending(x => x.Count()).First().Key

Notes:

  • if you need this to work on ancient (2.0) versions of .NET, consider LinqBridge. If you can't use C# 3.0 (while targeting .NET 2.0), you are probably better off with other solutions because lambda support is missing. Another .NET 2.0+ option is covered in xanatos's answer.
  • for the case of "aaaabbbb" only one of the two characters will be returned (thanks to xanatos for the comment). If you need all of the elements with the maximum count, use Albin's solution instead.
  • because of the sort, this is an O(n log n) solution. If you need better than that, find the max value by a linear search instead of sorting first, which gives O(n). See "LINQ: How to perform .Max() on a property of all objects in a collection and return the object with maximum value".
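
As a rough sketch of the linear-search idea from the last note (the class and method names here are made up for illustration, not taken from any answer), Aggregate can walk the groups once and keep the largest instead of sorting them. It assumes a non-empty input; Aggregate throws on an empty sequence just as First() does.

```csharp
using System;
using System.Linq;

public static class LinearMax
{
    // Hypothetical helper name, for illustration only.
    // Groups the characters, then makes a single pass over the groups,
    // keeping whichever has the highest count so far (no sorting).
    public static char MostFrequent(string input)
    {
        return input
            .GroupBy(c => c)
            .Select(g => new { Char = g.Key, Count = g.Count() })
            // Strict '>' means a tie keeps the character seen first,
            // because GroupBy yields groups in order of first occurrence.
            .Aggregate((best, next) => next.Count > best.Count ? next : best)
            .Char;
    }
}
```

Calling LinearMax.MostFrequent("abbbbccd") should give 'b'. Like the one-liner, it still returns only one character when several are tied.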


This is because someone asked for a 2.0 version, so no LINQ.

Dictionary<char, int> dict = new Dictionary<char, int>();

int max = 0;

foreach (char c in "abbbbccccd")
{
    int i;
    dict.TryGetValue(c, out i);
    i++;
    if (i > max)
    {
        max = i;
    }
    dict[c] = i;
}

foreach (KeyValuePair<char, int> chars in dict)
{
    if (chars.Value == max)
    {
        Console.WriteLine("{0}: {1}", chars.Key, chars.Value);
    }
}

And this is the LINQ version. It extracts tied "bests" (for "aaaabbbb" it returns a and b). It WON'T work if str == String.Empty, because First() throws on an empty sequence.

var str = "abbbbccccd";

var res = str.GroupBy(p => p).Select(p => new { Count = p.Count(), Char = p.Key }).GroupBy(p => p.Count, p => p.Char).OrderByDescending(p => p.Key).First();

foreach (var r in res) {
    Console.WriteLine("{0}: {1}", res.Key, r);
}
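
If the empty-string case matters, one possible workaround (a sketch only; the wrapper class and method names are hypothetical) is to swap First() for FirstOrDefault() and fall back to an empty sequence when there are no groups:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TiedBests
{
    // Hypothetical wrapper around the query above, for illustration only.
    // Returns every character that shares the highest count,
    // or an empty sequence when str is empty.
    public static IEnumerable<char> Of(string str)
    {
        // FirstOrDefault() yields null instead of throwing when there are no groups.
        IEnumerable<char> best = str
            .GroupBy(p => p)
            .Select(p => new { Count = p.Count(), Char = p.Key })
            .GroupBy(p => p.Count, p => p.Char)
            .OrderByDescending(p => p.Key)
            .FirstOrDefault();

        return best ?? Enumerable.Empty<char>();
    }
}
```

With this, TiedBests.Of("aaaabbbb") should yield a and b, and TiedBests.Of("") yields nothing. A null string would still throw, since GroupBy rejects a null source.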


string testString = "abbbbccd";
var charGroups = (from c in testString
                    group c by c into g
                    select new
                    {
                        c = g.Key,
                        count = g.Count(),
                    }).OrderByDescending(c => c.count);
foreach (var group in charGroups)
{
    Console.WriteLine(group.c + ": " + group.count);
}


Inspired by Stephen's answer, almost the same:

public static IEnumerable<T> Mode<T>(this IEnumerable<T> input)
{
    var dict = input.ToLookup(x => x);
    if (dict.Count == 0)
        return Enumerable.Empty<T>();
    var maxCount = dict.Max(x => x.Count());
    return dict.Where(x => x.Count() == maxCount).Select(x => x.Key);
}

var modes = "".Mode().ToArray(); //returns { }
var modes = "abc".Mode().ToArray(); //returns { a, b, c }
var modes = "aabc".Mode().ToArray(); //returns { a }
var modes = "aabbc".Mode().ToArray(); //returns { a, b }

Update: Did a quick benchmark of this answer vs Jodrell's answer (release build, debugger detached, oh yes)

source = "";

iterations = 1000000

result:

this - 280 ms
Jodrell's - 900 ms

source = "aabc";

iterations = 1000000

result:

this - 1800 ms
Jodrell's - 3200 ms

source = fairly large string - 3500+ char

iterations = 10000

result:

this - 3200 ms
Jodrell's - 3000 ms
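
The post does not say how these timings were taken. A minimal harness along these lines would do it, assuming System.Diagnostics.Stopwatch; the class name, method name and warm-up call are my own assumptions, not the original benchmark:

```csharp
using System;
using System.Diagnostics;

public static class ModeBenchmark
{
    // Hypothetical timing helper, for illustration only.
    // Runs the action once to warm up (JIT compilation), then times
    // the requested number of iterations with a Stopwatch.
    public static TimeSpan Time(Action action, int iterations)
    {
        action(); // warm-up call, not included in the measurement

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        return watch.Elapsed;
    }
}
```

With the Mode extension above in scope, something like ModeBenchmark.Time(() => "aabc".Mode().ToArray(), 1000000).TotalMilliseconds would give a figure comparable to the ones listed. The ToArray() call matters, because the extension is lazy and would otherwise do only part of the work.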


EDIT 3

Here is my last answer which I think (just) shades Nawfal's for performance on longer sequences.

However, given the reduced complexity of Nawfal's answer, and its more universal performance, especially in relation to the question, I'd choose that.

public static IEnumerable<T> Mode<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T> comparer = null)
{
    var counts = source.GroupBy(t => t, comparer)
        .Select(g => new { g.Key, Count = g.Count() })
        .ToList();

    if (counts.Count == 0)
    {
        return Enumerable.Empty<T>();
    }

    var maxes = new List<int>(5);
    int maxCount = 1;

    for (var i = 0; i < counts.Count; i++)
    {
        if (counts[i].Count < maxCount)
        {
            continue;
        }

        if (counts[i].Count > maxCount)
        {
            maxes.Clear();
            maxCount = counts[i].Count;
        }

        maxes.Add(i);
    }

    return maxes.Select(i => counts[i].Key);
}


EDIT



If you want an efficient generic solution, that accounts for the fact that multiple items could have the same frequency, start with this extension,

public static IOrderedEnumerable<KeyValuePair<int, IEnumerable<T>>> Frequency<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T> comparer = null)
{
    return source.GroupBy(t => t, comparer)
        .GroupBy(
            g => g.Count(),
            (k, s) => new KeyValuePair<int, IEnumerable<T>>(
                k,
                s.Select(g => g.First())))
        .OrderByDescending(f => f.Key);
}

This extension works in all of the following scenarios

var mostFrequent = string.Empty.Frequency().FirstOrDefault();

var mostFrequent = "abbbbccd".Frequency().First();

or,

var mostFrequent = "aaacbbbcdddceee".Frequency().First();

Note that mostFrequent is a KeyValuePair<int, IEnumerable<char>>.


If so minded you could simplify this to another extension,

public static IEnumerable<T> Mode<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T> comparer = null)
{
    var mode = source.GroupBy(
            t => t,
            (t, s) => new { Value = t, Count = s.Count() }, comparer)
        .GroupBy(f => f.Count)
        .OrderByDescending(g => g.Key).FirstOrDefault();

    return mode == null ? Enumerable.Empty<T>() : mode.Select(g => g.Value);
}

which obviously could be used thus,

var mostFrequent = string.Empty.Mode();

var mostFrequent = "abbbbccd".Mode();

var mostFrequent = "aaacbbbcdddceee".Mode();

here, mostFrequent is an IEnumerable<char>.


The simplest approach, without using any built-in functions.

Sample code and links:

public char MostOccurringCharInString(string charString)
{
    int mostOccurrence = -1;
    char mostOccurringChar = ' ';
    foreach (char currentChar in charString)
    {
        // Count how many times currentChar appears in the whole string.
        int foundCharOccurrence = 0;
        foreach (char charToBeMatch in charString)
        {
            if (currentChar == charToBeMatch)
                foundCharOccurrence++;
        }
        // Keep the first character that reaches the highest count so far.
        if (mostOccurrence < foundCharOccurrence)
        {
            mostOccurrence = foundCharOccurrence;
            mostOccurringChar = currentChar;
        }
    }
    return mostOccurringChar;
}

To learn more about how to get the max occurrence and how the flow works, see:

How to get max occurred character and max occurrence in string


Simplified expression using LINQ (note that it returns the highest count, not the character itself):
string text = "abccdeeef";
int length = text.ToCharArray().GroupBy(x => x).OrderByDescending(x => x.Count()).First().Count();


There are many different ways to solve the problem:

  1. LINQ
  2. Dictionary
  3. Using system.

You can choose based on your preferences. Here is one of them.

private static void CalculateMaxCharCountUsingArray(string actualString)
{
    char[] charArray = actualString.ToCharArray();

    // One counter per possible char value. With only 256 slots, any
    // character above '\u00FF' would throw IndexOutOfRangeException.
    int[] arr = new int[char.MaxValue + 1];
    int maxCount = 0;
    char maxChar = ' ';
    foreach (var r in charArray)
    {
        arr[r] = arr[r] + 1;
        if (maxCount < arr[r])
        {
            maxCount = arr[r];
            maxChar = r;
        }
    }
    Console.WriteLine("The character " + maxChar + " appeared the most times: " + maxCount);

    IEnumerable<char> distinctCharArray = charArray.Distinct();

    foreach (var r in distinctCharArray)
    {
        Console.WriteLine("The character " + r + " appeared " + arr[r] + " times in the string");
    }
}

I learned all of them from the below link for your reference.


Code:

class CharCount
{
    public void CountCharacter()
    {
        int n;
        Console.WriteLine("enter the no. of elements: ");
        n = Convert.ToInt32(Console.ReadLine());

        char[] chararr = new char[n];
        Console.WriteLine("enter the elements in array: ");
        for (int i = 0; i < n; i++)
        {
            chararr[i] = Convert.ToChar(Console.ReadLine());
        }
        Dictionary<char, int> count = chararr.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count());

        foreach(KeyValuePair<char, int> key in count)
        {
            Console.WriteLine("Occurrence of {0}: {1}",key.Key,key.Value);
        }

        Console.ReadLine();
    }
}


        // Find the most frequently occurring character, and its count, in the string below.

        string totest = "abcda12Zernn111y";

        string maxOccurringCharacter = "";
        int maxOccurrence = 0;

        // Always examine the first remaining character. Indexing with a loop
        // counter while the string shrinks would skip some characters.
        while (totest.Length > 0)
        {
            string currentLoopCharacter = totest[0].ToString();
            string updatedStringToTest = totest.Replace(currentLoopCharacter, "");

            // The drop in length is the number of occurrences just removed.
            int cnt = totest.Length - updatedStringToTest.Length;

            if (cnt > maxOccurrence)
            {
                maxOccurringCharacter = currentLoopCharacter;
                maxOccurrence = cnt;
            }

            totest = updatedStringToTest;
        }

        Console.WriteLine("The most occurring character is {0} and its occurrence count was {1}", maxOccurringCharacter, maxOccurrence.ToString());
        Console.ReadLine();


This is Femaref's solution modified to return multiple letters if their counts match. It's no longer a one-liner, but it is still reasonably concise and should be fairly performant.

    public static IEnumerable<char> GetMostFrequentCharacters(this string str)
    {
        if (string.IsNullOrEmpty(str))
            return Enumerable.Empty<char>();

        var groups = str.GroupBy(x => x).Select(x => new { Letter = x.Key, Count = x.Count() }).ToList();
        var max = groups.Max(g2 => g2.Count);
        return groups.Where(g => g.Count == max).Select(g => g.Letter);
    }


A different approach using LINQ and a Dictionary data structure as a lookup list:

        var str = "abbbbccd";
        var chrArr = str.ToCharArray();
        Dictionary<char, int> dic = new Dictionary<char, int>();
        foreach (char a in chrArr)
        {
            if (dic.ContainsKey(a))
                dic[a]++;
            else
                dic.Add(a, 1);
        }
        int count = dic.Values.Max();
        char val = dic.Where(d => d.Value == count).FirstOrDefault().Key;