I need to write a program to find the mode. Or the most occurrence of an integer or integers. So,
1,2,3,4,1,10,4,23,12,4,1 would have mode of 1 and 4.
I'm not really sure what kind of algorithm i should use. I'm having a hard time trying to think of something that would work.
I was thinking of a frequency table of some sort maybe where i could go through array and then go through and create a linked list maybe. If the linked doesn't contain that value add it to the linked, if it d开发者_如何学Pythonoes then add 1 to the value.
So if i had the same thing from above. loop through 1,2,3,4,1,10,4,23,12,4,1
Then list is empty so add node with number = 1 and value = 1. 2 doesnt exist so add node with number = 2 and value = 1 and so on. Get to the 1 and 1 already exists so value = 2 now.
I would have to loop through the array and then loop through linked list everytime to find that value.
Once i am done then go through the linked list and create a new linked list that will hold the modes. So i set the head to the first element which is 1. Then i go through the linked list that contains the occurences and compare the values. If the occurences of the current node is > the current highest then i set the head to this node. If its = to the highest then i add the node to the mode linked list.
Once i am done i loop through the mode list and print the values.
Not sure if this would work. Does anyone see anything wrong with this? Is there an easier way to do this? I was thinking a hash table too, but not really sure how to do that in C.
Thanks.
If you can keep the entire list of integers in memory, you could sort the list first, which will make repeated values adjacent to each other. Then you can do a single pass over the sorted list to look for the mode. That way, you only need to keep track of the best candidate(s) for the mode seen up until now, along with how many times the current value has been seen so far.
The algorithm you have is fine for a homework assignment. There are all sorts of things you could do to optimise the code, such as:
- use a binary tree for efficiency,
- use an array of counts where the index is the number (assuming the number range is limited).
But I think you'll find they're not necessary in this case. For homework, the intent is just to show that you understand how to program, not that you know all sorts of tricks for wringing out the last ounce of performance. Your educator will be looking far more for readable, structured, code than tricky optimisations.
I'll describe below what I'd do. You're obviously free to use my advice as much or as little as you wish, depending on how much satisfaction you want to gain at doing it yourself. I'll provide pseudo-code only, which is my standard practice for homework questions.
I would start with a structure holding a number, a count and next pointer (for your linked list) and the global pointer to the first one:
typedef struct sElement {
int number;
int count;
struct sElement *next;
} tElement;
tElement first = NULL;
Then create some functions for creating and using the list:
tElement *incrementElement (int number);
tElement *getMaxCountElement (void);
tElement *getNextMatching (tElement *ptr, int count);
Those functions will, respectively:
- Increment the count for an element (or create it and set count to 1).
- Scan all the elements returning the maximum count.
- Get the next element pointer matching the count, starting at a given point, or NULL if no more.
The pseudo-code for each:
def incrementElement (number):
# Find matching number in list or NULL.
set ptr to first
while ptr is not NULL:
if ptr->number is equal to number:
return ptr
set ptr to ptr->next
# If not found, add one at start with zero count.
if ptr is NULL:
set ptr to newly allocated element
set ptr->number to number
set ptr->count to 0
set ptr->next to first
set first to ptr
# Increment count.
set ptr->count to ptr->count + 1
def getMaxCountElement (number):
# List empty, no mode.
if first is NULL:
return NULL
# Assume first element is mode to start with.
set retptr to first
# Process all other elements.
set ptr to first->next
while ptr is not NULL:
# Save new mode if you find one.
if ptr->count is greater than retptr->count:
set retptr to ptr
set ptr to ptr->next
# Return actual mode element pointer.
return retptr
def getNextMatching (ptr, number):
# Process all elements.
while ptr is not NULL:
# If match on count, return it.
if ptr->number is equal to number:
return ptr
set ptr to ptr->next
# Went through whole list with no match, return NULL.
return NULL
Then your main program becomes:
# Process all the numbers, adding to (or incrementing in) list .
for each n in numbers to process:
incrementElement (n)
# Get the mode quantity, only look for modes if list was non-empty.
maxElem = getMaxCountElement ()
if maxElem is not NULL:
# Find the first one, whil exists, print and find the next one.
ptr = getNextMatching (first, maxElem->count)
while ptr is not NULL:
print ptr->number
ptr = getNextMatching (ptr->next, maxElem->count)
If the range of numbers is known in advance, and is a reasonable number, you can allocate a sufficiently large array for the counters and just do count[i] += 1
.
If the range of numbers is not known in advance, or is too large for the naive use of an array, you could instead maintain a binary tree of values to maintain your counters. This will give you far less searching than a linked list would. Either way you'd have to traverse the array or tree and build an ordering of highest to lowest counts. Again I'd recommend a tree for that, but your list solution could work as well.
Another interesting option could be the use of a priority queue for your extraction phase. Once you have your list of counters completed, walk your tree and insert each value at a priority equal to its count. Then you just pull values from the priority queue until the count goes down.
I would go for a simple hash table based solution.
A structure for hash table containing a number and corresponding frequency. Plus a pointer to the next element for chaining in the hash bucket.
struct ItemFreq {
struct ItemFreq * next_;
int number_;
int frequency_;
};
The processing starts with
max_freq_so_far = 0;
It goes through the list of numbers. For each number
, the hash table is looked up for a ItemFreq
element x
such that x.number_ == number
.
- If no such
x
is found, then aItemFreq
element is created as{ number_ = number, frequency_ = 1}
and inserted into the hash table. - If some
x
was found then itsfrequency_
is incremented. - If
frequency_ > max_freq_so_far
thenmax_freq_so_far = frequency
Once traversing through the list of numbers of complete, we traverse through the hash table and print the ItemFreq
items whose frequency_ == max_freq_so_far
The complexity of the algorithm is O(N)
where N
is the number of items in the input list.
For a simple and elegant construction of hash table, see section 6.6 of K&R (The C Programming Language).
This response is a sample for the idea of Paul Kuliniewicz:
int CompInt(const void* ptr1, const void* ptr2) {
const int a = *(int*)ptr1;
const int b = *(int*)ptr2;
if (a < b) return -1;
if (a > b) return +1;
return 0;
}
// This function leave the modes in output and return the number
// of modes in output. The output pointer should be available to
// hold at least n integers.
int GetModes(const int* v, int n, int* output) {
// Sort the data and initialize the best result.
qsort(v, v + n, CompInt);
int outputSize = 0;
// Loop through elements while there are not exhausted.
// (look there is no ++i after each iteration).
for (int i = 0; i < n;) {
// This is the begin of the new group.
const int begin = i;
// Move the pointer until there are no more equal elements.
for (; i < n && v[i] == v[begin]; ++i);
// This is one-past the last element in the current group.
const int end = i;
// Update the best mode found until now.
if (end - begin > best) {
best = end - begin;
outputSize = 0;
}
if (end - begin == best)
output[outputSize++] = v[begin];
}
return outputSize;
}
精彩评论