开发者

What is the best sort algorithm for a random set of floats?

开发者 https://www.devze.com 2023-01-07 03:46 出处:网络
A colleague of mine just put that question out in the air this afternoon, and somewhat left me curious. I\'m versed with sorting algos, but lacking a formal degree in compsci / compeng (something I\'m

A colleague of mine just put that question out in the air this afternoon, and somewhat left me curious. I'm versed with sorting algos, but lacking a formal degree in compsci / compeng (something I'm sorta averse to admitting), can't really place my finger on this one. :p

And oh yeah, this is mildly in the context 开发者_开发百科of a C#/.NET implementation... just in case that changes things a bit.

Thanks guys. :)


For fixed-length numbers, you're not restricted to comparison-based sorting algorithms, so O(n*log(n)) is not the limit. Radix Sort works in O(n), and can be used quite conveniently due to the amazing property of IEEE 754 floats of being sorted correctly when their bit pattern is interpreted as integers.


I see that no one has mentioned introsort, which solves quick sort's O(n^2) worst case by switching to heapsort when the recursion depth exceeds a certain threshold. This means that quick sort will not get the chance to degenerate, since its number of recursive calls will definitely be limited.

Another optimization is to switch to insertion sort whenever the number of elements of the sequence you are currently at is small (say 16).

This is how introsort could look:

void Introsort(int A[], int N, int left, int right, int depth)
{
    if ( left < right ) // note: this doesn't switch to insertion sort if right - left is small enough
    {   
        if ( (1 << depth) > N )
            Heapsort(A, left, right);
        else
        {
            int P = Partition(A, left, right);
            Introsort(A, N, left, P, depth+1);
            Introsort(A, N, P+1, right, depth+1);
        }
    }
}

This, combined with a good partition function (simply randomly choosing the pivot should be good enough for most purposes), will give you a very fast sorting algorithm.

There is also the choice of radix sort, which works really well, especially if your floats are not too big. From what I've seen though, it takes millions of elements for radix sort to outperform introsort.


If you want a visual representetion on sorting-algorithms, check out this fantastic site:

Sorting-algorithms.com

You'll get the feeling which works best in different cases, but my favorite is the merge sort, even though its not much better than quick sort.


Theoretically you compare algorithms using the big O notation, which lets you compare which algorithm will be faster for "almost infinite" problem. In practise in most cases this is a very good staring point to compare how the algorithms will behave in real life.

The two most popular fast sort algorithms are MergeSort and quick sort. Merge Sort is guaranteed to be O(n log n) for any data, whereas quick sort has average time of O(n log n) and pessimistic time O(n^2). In practise most people use quick sort, because:

  1. It happens almost in place naturally (I think you can make merge sort in place but it is tedious and will make it slower - it will increase the constant hidden in the O notation) - for large data sets this is an issue if the data does not fit into memory
  2. It is faster in practise in most cases
  3. You can modify it slightly (i.e. take the median of first, middle and last element for partitioning) so that it is very difficult to get data that would make it slow

To sum up I think quick sort will be quicker for your random floats, even though only looking at the O notation it seems worse - because you will get the expected O(n log n) and it will have a smaller constant than merge sort.


A minor point to be aware of is that if any of your set are nan then the set isn't ordered and some sorting algorithms might give unexpected results or even crash. I thinks its best to ensure that none of your numbers are nan before sorting.

For example (using gcc 3.4.6) applying qsort (ascending) to { 2, 1, nan, -1} gives {1, 2, nan, -1}.

On the other hand inf and -inf are not a problem.

0

精彩评论

暂无评论...
验证码 换一张
取 消