Why does using Random in Sort causing [Unable to sort IComparer.Compare error]_问答_开发者

I tried shuffling a list of byte (List) using either code:

myList.Sort((a, b) => this._Rnd.Next(-1, 1));

myList.Sort(delegate(byte b1, byte b2)
{
    return this._Rnd.Next(-1, 1);
});

and they threw the following error:

Unable to sort because the IComparer.Compare() method returns inconsistent results. Either a value does not compare equal to itself, or one value repeatedly compared to another value yields different results. x: '{0}', x's type: '{1}', IComparer: '{2}'.

What is wrong with using a random rather than the compare function of byte?

I tried using LINQ function instead and it works.

var myNewList = myList.OrderBy(s => Guid.NewGuid());
var myNewList = myList.OrderBy(s => this._Rnd.NextDouble());

I did read that these methods are slower than Fisher开发者_StackOverflow社区–Yates shuffle giving O(n) runtime only. But was just wondering on using the Sort function and random.

Not only is the comparison relation required to be consistent, it is also required to impose a total ordering. For example, you cannot say "socks are smaller than shoes, shirts are neither smaller than nor greater than trousers" blah blah blah, feed that into a sort algorithm and expect to get a topological sort out of the other end. Comparison sorts are called comparison sorts because they require a well-formed comparison relation. In particular, quicksort can run forever or give nonsensical results if the comparison relation is not consistent, transitive, and total-ordering.

If what you want is a shuffle then implement a Fischer-Yates shuffle. (Do it correctly; even though the algorithm is trivial, it is almost always implemented wrong.) If what you want is a topological sort then implement a topological sort. Use the right tool for the job.

Because as the error says, Random is not consistent. You must have a comparer that always returns the same result when given the same parameters. otherwise the sort will not be consistent.

Knuth has a random sort algorithm which worked like an insertion sort, but you swapped the value with a randomly chosen location in hhe existing array.

Sorting algorithms generally work by defining a comparison function. The algorithms will repeatedly compare two items in the sequence to be sorted and swap them if their current order doesn't match the desired order. The differences between algorithms have mainly to do with finding the most efficient way possible in the given circumstances to do all the compares.

In the process of making all these compares, it's common for the same two elements in a sequence to need to be compared more than once! Using non-numeric data to make this easier, let's say you have items with values "Red" and "Apple". The random comparer selects "Apple" as the greater item on the first comparison. Later on, if the random comparer selects "Red" as the greater item, and this back and forth continues, you can end up in a situation where the algorithm never finishes.

Mostly you get lucky, and nothing happens. But sometimes you don't. .Net is pretty good about not just running forever and guards against this, but it does (and should!) throw an exception when these guards detect inconsistent ordering.

Of course, the correct way to handle this in the general case is via a Knuth-Fisher-Yates shuffle.

It's further worth mentioning that there are times when a simple Fisher-Yates is not appropriate. One example is needing to randomize a sequence of unknown length... say you're wanting to randomly rearrange data received from a network stream, without knowing how much data is in the stream, and feed that data as quickly as possible to a worker thread elsewhere.

In this situation you can't perfectly randomize that data. Without knowing the length of the stream you don't have enough information to correctly do the shuffle, and even if you did you might find the length is so long as to make holding it all in RAM or even on disk impractical. Or you might find that the stream won't complete for hours, but your worker thread needs to get going much sooner. In this case, you'd likely settle for (and understanding that this is "settling" is important) an algorithm that loads a buffer of adequate length, randomizes the buffer, feeds out about half the buffer to the worker thread, and then re-fills the empty portion of the buffer to repeat the process. Even here, you're likely using Knuth-Fisher-Yates for the step that randomizes the buffer.