Probabilistic selection algorithm_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-02-11 20:00 出处：网络

Given is: An array of length N. The array contains integers. The integers are not necessarily sorted. Find an algorithm that:

相关专题：algorithm

Given is:

An array of length N.
The array contains integers.
The integers are not necessarily sorted.

Find an algorithm that:

Returns (a close approximation of)the K-th smallest array element.
Has a runtime complexity of O(N log N) and a space complexity of O(log N).
The algorithm needn't be deterministic. In case of a probabilistic algorithm also provide a measure for the qu开发者_StackOverflow中文版ality of the approximated result.

Treat the problem as something analogous to Quicksort. Given an element in the array, you can get its rank in O(n) time and O(lg n) space. You can use binary search to find an element with a given rank in O(lg n) iterations of that, for a total of O(lg n) space and O(n lg n) time.

Iterate over the array once to find the minimum and maximum element.
Iterate over the array to find a random element pivot between minimum and maximum (exclusive).
Iterate over the array and count the number of elements less than or equal to pivot (numSmallerEqual) and the number of elements greater than pivot (numBigger).
1. If K <= numSmallerEqual, set maximum=pivot.
2. Else set minimum=pivot.
If (maximum - minimum)==0, output minimum, terminate.
If (maximum - minimum)==1
1. If K <= numSmallerEqual, output minimum.
2. Else output maximum.
3. Terminate.
GOTO 2:

EDIT: Corrected the error pointed out by lVlad, still not tested.

Don't construct partitions. Describe what the partitions are (in constant space), and recursively select on this.

Each subarray that quickselect recurses into can be described by its bounds (min and max element values, not their indexes). Iterating over a subarray so described requires O(n) comparisons, which are made at every recursion level, up to the same depth as in quicksort: O(log n) in the average case.

Quicksort also makes O(n) comparisons at every recursion level, while ordinary permuting quickselect makes O(n) comparisons in total in the average case (because it always recurses into only one partition).

Here's a sample implementation for distinct elements with an ordinary quickselect implementation for comparison.

You can take the parameterization approach:

Since we have k specified in the problem, we can treat it as a constant, thus space of O(k log N) should be acceptable.

Divide the array into k partitions of equal length (O(1) time and O(k log N) space to store the partition borders because each key should only require log N space)
Find the smallest element in each partition (O(N) time and O(k log N) space to store the smallest elements
return the max of the k elements you have (O(k) time)

Since there's room for more cycles in there, there's probably a better way to do it. Also note that this would perform very poorly on an array that is sorted...