Calculating complexity of algorithm

I am not sure if this is a valid question.

Let's assume there is an O(n^3) algorithm which sorts 100 numbers per day with a computing power x.

How many numbers will it be able to sort if the computing power is doubled, i.e., 2x?

I understand that I didn't define 'computing power'. Please correct the question if it is ambiguous or wrong.


Big O notation does not give you exact time (real computer time). It is a method that tries to describe the efficiency of algorithms independently of the CPU or other machine-specific features.

From a mathematical perspective, you can say that O(n^2) is more efficient than O(n^3). But from an engineering point of view, for some values of n, an O(n^3) algorithm can be better than an O(n^2) one. The notation only says that for sufficiently large n, O(n^2) will be more efficient than O(n^3).
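To make that concrete, here is a minimal Python sketch; the constant factors 100 and 1 are invented purely for illustration, comparing a hypothetical O(n^2) algorithm with a large constant against a hypothetical O(n^3) algorithm with a small one:

    # Hypothetical cost models; the constants are made up for illustration only.
    def cost_quadratic(n):   # an "O(n^2)" algorithm with a large constant factor
        return 100 * n * n

    def cost_cubic(n):       # an "O(n^3)" algorithm with a small constant factor
        return n ** 3

    for n in (10, 50, 100, 200):
        q, c = cost_quadratic(n), cost_cubic(n)
        cheaper = "cubic" if c < q else "quadratic"
        print(f"n={n:4d}  100*n^2={q:>10d}  n^3={c:>10d}  cheaper: {cheaper}")

For n below 100 the cubic model does fewer operations; beyond that point the quadratic one wins, which is exactly what "for sufficiently large n" means.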

There is a good introduction to algorithm analysis on YouTube; the first lecture is good for explaining these things: MIT 6.046J / 18.410J Introduction to Algorithms.

Big O notation can only say that, on the same machine and for sufficiently large n, O(n^2) is better than O(n^3). But for the same algorithm, I don't think you can draw a direct proportion between computing power and output. If we double the computing power, does the output double? Maybe yes, maybe no. I don't think we can say such a thing just by looking at the algorithm's Big O.

N = 100 may not be big enough to assume that the algorithm has already reached its asymptotic behaviour in terms of CPU usage, so using the big O expression to extrapolate CPU usage for bigger datasets may produce erroneous results.

However, you can try to determine whether you are already in the asymptotic zone by measuring the CPU usage for datasets of other sizes (for instance, 50, 80, 120, etc.) and seeing how well they fit a t = C * N^3 curve, where C is a constant to be determined.

If the fit is "good enough" after some given N, you can use the same curve to make predictions of CPU usage for bigger datasets.
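As a rough sketch of that procedure (the sizes and timings below are invented just to show the fitting step, not real measurements):

    import numpy as np

    # Hypothetical measurements: dataset sizes and CPU times in seconds.
    sizes   = np.array([50.0, 80.0, 100.0, 120.0])
    timings = np.array([0.9, 3.8, 7.1, 12.5])

    # Least-squares fit of t = C * N^3 (a single free parameter C).
    C = np.sum(timings * sizes**3) / np.sum(sizes**6)

    # If the residuals are small, the cubic term dominates and the
    # same curve can be used to extrapolate to bigger datasets.
    residuals = (timings - C * sizes**3) / timings
    print("fitted C:", C)
    print("relative residuals:", residuals)
    print("predicted time for N=200:", C * 200**3, "seconds")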

In any case, you should consider those predictions a lower limit on the CPU time required to process a given dataset, because in practice other resources such as memory, caches, or I/O can become the bottleneck instead of the CPU.

Update: in response to Vapuru's post and its comments:

For a typical O(N^3) algorithm, the number of operations performed by a specific implementation is t = a + b * N + c * N^2 + d * N^3 (there may be other components, such as log N, but as you will see, it doesn't actually matter). So, for two given datasets of size N1 and N2 you can calculate t1 and t2 respectively.

For the given OP problem, t2 / t1 = 2, so (a + b * N2 + c * N2^2 + d * N2^3) / (a + b * N1 + c * N1^2 + d * N1^3) = 2. For N1 and N2 big enough, that expression can be approximated as (d * N2^3) / (d * N1^3) = 2, which simplifies to (N2 / N1)^3 = 2, and so N2 = N1 * 2^(1/3).

That means that for big datasets, doubling the processing time (or doubling the processing speed while keeping the time unchanged) allows the dataset size to grow by a factor of about 2^(1/3), that is, roughly 1.26, or a 26% increase.

In practice, other factors besides CPU speed may affect performance, so actually it should be read as N2/N1 < 1.26.
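Applying this to the numbers in the question (assuming the workload is purely CPU-bound, so the 2^(1/3) factor applies in full):

    # N1 = 100 numbers sorted per day with the original computing power x.
    N1 = 100
    speedup = 2                      # computing power doubled to 2x
    N2 = N1 * speedup ** (1 / 3)     # N2 = N1 * 2^(1/3) for an O(N^3) algorithm
    print(round(N2))                 # 126 -> at most about 126 numbers per day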
