Note: Please do not tag this as homework! I am not a student and this is not an assignment. I am a software engineer dusting off my old Data Structures & Algorithms textbook and trying to remember stuff I learned years ago and that I cannot seem to find anywhere online.
I recall a specific lecture, and it is reinforced in my textbook, that algorithm bounds (upper, tight and lower) and cases (best, average and worst) are开发者_开发技巧 not one in the same. But for the life of me I can't remember how the two concepts are different.
To me, if some algorithm is O(n) worst-case, then it can perform any worse than some linear function, such as f(n) = cn + k
. Since we are guaranteed this in the worst case, it seems to me that its upper-bound is also linear.
I know I'm wrong, I just can't figure out why.
I'm a contextual learner, so if someone could provide a meaningful example where either worst-case is not upper bound, or best case is not lower bound, or where average case is not tight-bound, that would probably get through to me the fastest.
Thanks for any clarity on this!
Consider a variant of quicksort that checks if the array is already sorted and, if not, sorts using the first element as a pivot.
The best case is O(n) if the array is already sorted. This is an upper bound on the best-case behavior, which is meaningful if not interesting.
The average case over random inputs is O(n3/2) and Ω(n). Okay, I cheated slighty because it's also Theta(n log n), but the idea is that bounds are not always tight, and this lack of tightness may manifest itself when I'm describing average-case bounds.
The worst case is Theta(n2) if the array is reverse sorted, because the subproblems are so unbalanced (each time, we pivot with the maximum element, resulting in subproblems of sizes 0 and n - 1 instead of about n/2 and about n/2). This is a tight bound on the worst case and shows that the algorithm really can be that bad, but no worse. Quicksort is also O(n3), but not Theta(n3), because quicksort has no family of inputs that results in cubic behavior.
The case refers to the type of running time being investigated, whereas the bound refers to a function for that running time.
Saying that an algorithm has worst-case running time of O(f(n)) is equivalent to saying that f(n) is an asymptotic upper bound for the worst-case running time of the algorithm.
The 3 cases (best-case, average-case, and worst-case) and 5 asymptotic bounds (upper (O), tight upper (o), lower (Ω), tight lower (ω), and tight (Θ)) give 15 distinct ways of making a statement about the running time.
Confusion or disaster can arise if the bound is specified without the case. For example the lower bound of a sorting algorithm can be both Θ(n) and Θ(n lg n), depending on whether we are talking about best-case or worst-case. A Θ(1) average case running time is no good if the Θ(n3) worst-case running time makes the factory grind to a halt.
The naive algorithm for a sum, for example, is in O(n), O(n^2), O(n^3) and all higher classes when O denotes the upper bound of the worst case. It is also in Theta(n), but not in Theta(n^2) or Theta(n^0), when Theta is the strict bound. It is in Omega(n), Omega(n^0) and all lower classes when Omega is the lower bound.
This is distinct to worst/best/average case, which are all equal for the sum function. Worst/best case just determines the optimism on the input.
Description :
Above extract from here.For an example : Let's say we have an algorithm that needs to compare every element in an array to every other element in the array. A simple implementation may look like:
for my $i (0 .. $#array) {
for my $j (0 .. $#array) {
next if $j == $i;
# Compare $i, $j
}
}
This is O(N^2). After a little bit of testing we decide that this is far too slow, so we make a little optimization.
for my $i (0 .. $#array - 1) {
for my $j ($i + 1 .. $#array) {
# Compare $i, $j
}
}
We have just cut our run time in half - YAY! Guess what, the Big-O has stayed the same O(N^2). This is because N^2 / 2 only has one variable part.
This part was from here.
Consider a function which takes a list of N+1 integers. Let's say that if the first element is 0, the function calls either an Th(log n) or an Th(n) function to process the data, and the determination occurs uniformly at random. Similarly, if the first element is 1, call an Th(n^2) to process the data. For all other values of the first element,, call either a Th(n^1.5) or a Th(n log n) function. We can say the following about this function's complexity:
- Best case: Omega(log n), Omicron(n)
- Average case*: Omega(n log n), Omicron(n^1.5)
Worst case: Omega/Omicron/Theta(n^2)
- Assuming that the number of possible values for the first element is sufficiently large.
Also, bear in mind that complexity deals with an algorithm's implementation; you can always increase the time complexity of an algorithm by adding dummy loops, whereas complexity/computability theory studies the minimal complexity of algorithms to solve computational problems. Subtle distinction.
Talking about a case means you are considering the performance of the algorithm over some specific input. Worst case means "for what we are about to discuss, consider running everything over all possible inputs and take the statement to the apply to the slowest run", and so on.
A bound is an inequality, a statement that something is less than or greater than something else. A bound on the complexity says that whatever its complexity function is, it's greater or less than the thing specified. Big-O, little-o, Omega, are all notations telling you about bounds. If something is O(n), that means the runs of the algorithm described, over input satisfying parameter n, will never be worse than some linear function, asymptotically (or, in even more excruciating detail, that there exists some N and c, st for all n>N, cn beats the running time of the algorithm).
That's not so bad: O, o, Omega, just tell you about the bounds on the problem. You can then talk about the bounds on the best case, or the worst case, and so on. For example, most sorting algorithms will be best-case O(n). You can put bounds on the worst case, which are clearly also bounds on all cases. Theta is a special notation: Theta(f) means O(f) and Omega(f), which means that the bound is tight, and can't be improved. It's an exact statement.
An algorithm that performs much better on some input that others won't have a Theta bound, but you'd always like to find one when you're talking about a specific input. It's usually not too hard to strengthen statements about O bounds into a Theta statement about the algorithm restricted to its worst case input, for example.
So, every algorithm has a complexity, you just might struggle to identify or prove it. You want to get the smallest upper bounds possible (push down O) and biggest lower bounds. Proving things about specific cases of the algorithm, or averaging out over all inputs, often gives you interesting and useful information too.
精彩评论