开发者

Find word with maximum number of occurrences

开发者 https://www.devze.com 2023-04-07 00:12 出处:网络
What is the most optimal way (algorithm) to search for the word that has the maximum number of occurrence开发者_运维知识库s in a document?Finding the word that occures most times in a document can be

What is the most optimal way (algorithm) to search for the word that has the maximum number of occurrence开发者_运维知识库s in a document?


Finding the word that occures most times in a document can be done in O(n) by a simple histogram [hash based]:

histogram <- new map<String,int>
for each word in document: 
   if word in histogram:
      histogram[word] <- histogram[word] + 1
   else:
      histogram[word] <- 1
max <- 0
maxWord<- ""
for each word in histogram:
  if histogram[word] > max:
     max <- histogram[word]
     maxWord <- word
return maxWord

This is O(n) solution, and since the problem is clearly Omega(n) problem, it is optimal in terms of big O notation.


  1. Scan the document once, keeping a count of how many times you have seen every unique word (perhaps using a hashtable or a tree to do this).
  2. While performing step 1, keep track of the word that has the highest count of all words seen so far.
0

精彩评论

暂无评论...
验证码 换一张
取 消