I have had this problem for a while and am still trying to work out a solution.
What is the best possible method for evenly distributing items in a list, with a low discrepancy?
Say we have a list of elements in an array:
Red, Red, Red, Red, Red, Blue, Blue, Blue, Green, Green, Green, Yellow
So ideally the output would produce something like:
Red, Blue, Red, Green, Red, Yellow, Blue, Red, Green, Red, Blue, Green.
Here each instance is as far as possible from the other instances of the same colour.
When I first tried to solve this problem, I must admit I was naive: I just used some form of seeded random number to shuffle the list, but this leads to clumping of instances.
A suggestion was to start with the item with the highest frequency, so Red is placed at positions n*12/5 for n from 0 to 4 inclusive.
Then place the next most frequent element (Blue) at positions n*12/3 + 1 for n from 0 to 2 inclusive. If a position is already taken, use the next empty spot, and so on. However, when I worked it out on paper, this does not work in all circumstances.
Say the list is only
Red, Red, Red, Red, Red, Blue
It will fail, because either of the following options has three same-color adjacencies:
Red, Red, Blue, Red, Red, Red
Red, Red, Red, Blue, Red, Red
So please, any ideas or implementations for how to do this would be awesome.
If it matters, I'm working in Objective-C, but right now all I care about is the methodology.
You can generate a series of rational numbers indicating even spacing for each colour. Then, sort all of those numbers.
Example:
- 5 × B: the numbers are (1/10 3/10 5/10 7/10 9/10)
- 3 × R: the numbers are (1/6 3/6 5/6)
- sorted: ((1/10 "B") (1/6 "R") (3/10 "B") (5/10 "B") (3/6 "R") (7/10 "B") (5/6 "R") (9/10 "B"))
- => B R B B R B R B
When the numbers are equal, apply a secondary sorting, which can be arbitrary, but should be consistent.
Note that the individual sequences are already sorted, so you can sort by merging, which is just O(n·log m) in this case (n being the sum of all counts, m the number of colours). This can be further optimized by generating the numbers lazily.
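The sorting idea above can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the function name `spread_by_sorting` and the `counts` dictionary are assumptions made for the example, and it uses a plain sort instead of the merge mentioned above. Ties are broken by comparing the colour names, which is arbitrary but consistent.

```python
from fractions import Fraction

def spread_by_sorting(counts):
    """counts maps each colour to how many times it occurs (hypothetical input shape)."""
    keyed = []
    for colour, count in counts.items():
        for k in range(count):
            # The k-th of `count` items gets the position (2k + 1) / (2 * count),
            # e.g. 1/10, 3/10, ... for a colour that occurs 5 times.
            keyed.append((Fraction(2 * k + 1, 2 * count), colour))
    # Equal fractions fall back to comparing the colour names (the secondary sort).
    keyed.sort()
    return [colour for _, colour in keyed]

print(spread_by_sorting({"B": 5, "R": 3}))
# ['B', 'R', 'B', 'B', 'R', 'B', 'R', 'B']
```

Exact fractions are used so that ties such as 5/10 and 3/6 compare as equal instead of depending on floating-point rounding.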
The final algorithm does not need the explicit sorting:
- set the B counter to (/ (* 2 5)) => 1/10
- set the R counter to (/ (* 2 3)) => 1/6
- set the B step to double the B counter
- set the R step to double the R counter
- loop
  - take one of the colours with the lowest counter and put it into your result
  - step that counter by its step width
- until all counters are >= 1
Since you need n steps of the loop, and each time have to find the minimum of m numbers, this seems to run in O(n·m). However, you can keep the counters in a min-heap to bring it down to O(n·log m) again.
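Here is a rough Python sketch of the counter loop with a min-heap. The name `spread_with_heap` and the `counts` dictionary are assumptions for illustration, and the colours are assumed to be mutually comparable (for example strings) so that ties on the counter are broken consistently.

```python
import heapq
from fractions import Fraction

def spread_with_heap(counts):
    """counts maps each colour to how many times it occurs (hypothetical input shape)."""
    # Each heap entry is (counter, colour, step): the counter starts at 1/(2*count)
    # and the step is double that, i.e. 1/count.
    heap = [(Fraction(1, 2 * c), colour, Fraction(1, c))
            for colour, c in counts.items() if c > 0]
    heapq.heapify(heap)
    result = []
    while heap:
        # Take the colour with the lowest counter and put it into the result.
        counter, colour, step = heapq.heappop(heap)
        result.append(colour)
        counter += step
        # A colour whose counter has reached 1 is used up and leaves the heap.
        if counter < 1:
            heapq.heappush(heap, (counter, colour, step))
    return result

print(spread_with_heap({"B": 5, "R": 3}))
# ['B', 'R', 'B', 'B', 'R', 'B', 'R', 'B']
```

Each of the n loop iterations does one pop and at most one push on a heap of at most m entries, which is where the O(n·log m) bound comes from.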
Just a quick idea: use a separate list for each type of item. Then, as in the merge step of a merge sort, insert one item from each list into a new list, always in the same order. Skip empty lists.
This of course does not yield the perfect solution, but it is very easy to implement and should be fast. A simple improvement is to sort the lists by size, largest first. This gives slightly better results than a random order of lists.
Update: perhaps this could make it better: take the size of the largest list at the start of the algorithm and call it LARGEST_SIZE; that list gets its turn in every round. Every other list should get a turn in only starting_size_of_the_list/LARGEST_SIZE of the rounds. I hope that is clear. This way you should be able to space all the items evenly, but it still is not perfect!
OK, so I will try to be more specific. Say you have 4 lists of sizes 30, 15, 6 and 3.
For the first list, the ratio is 30/30 = 1, so it is used in every round. For the second list, 15/30 = 0.5, so it is used every 2 rounds. Third list: 6/30 -> every 5 rounds. Last list: 3/30 -> every 10 rounds. This should give you a nice spacing of items.
This is of course a convenient example; for other numbers it gets a bit uglier. For very small numbers of items this won't give perfect results, but for large numbers of items it should work quite nicely.
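One possible reading of the round-based idea, sketched in Python. The answer does not say how to handle ratios that do not divide evenly, so the running-quota rule below (emit when floor(r * size / LARGEST_SIZE) advances) is an assumption, as are the name `spread_in_rounds` and the `counts` dictionary.

```python
def spread_in_rounds(counts):
    """counts maps each item type to its list size (hypothetical input shape)."""
    if not counts:
        return []
    # Largest list first, as suggested above.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    largest_size = ordered[0][1]
    result = []
    for r in range(1, largest_size + 1):
        for item, size in ordered:
            # A list gets its turn in round r when its integer quota
            # floor(r * size / largest_size) has grown since round r - 1.
            if (r * size) // largest_size > ((r - 1) * size) // largest_size:
                result.append(item)
    return result

rounds = spread_in_rounds({"a": 30, "b": 15, "c": 6, "d": 3})
print(rounds[:3])  # ['a', 'a', 'b']
```

With sizes 30, 15, 6 and 3 this uses the lists every 1, 2, 5 and 10 rounds respectively, and every list is fully consumed after LARGEST_SIZE rounds. As noted above, small inputs can still come out poorly spaced.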
I'll post here the solution that I've used a few times for this problem in algorithm contests.
Keep a max-heap of pairs (COUNTER, COLOUR), ordered by COUNTER, so the colour with the largest COUNTER is on top. At each step there are two cases. If the colour at the top is not equal to the last element of the list, remove the pair (COUNTERx, COLOURx) from the heap, append COLOURx to the list, and push (COUNTERx - 1, COLOURx) back onto the heap if COUNTERx - 1 != 0. Otherwise, take the pair with the second-largest COUNTER instead and do the same with it. The time complexity is O(S log N), where N is the number of colours and S is the size of the list.
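A minimal Python sketch of this max-heap approach. Python's `heapq` is a min-heap, so the counters are stored negated; the name `spread_greedy` and the `counts` dictionary are assumptions for the example. When only one colour is left and it matches the last element, the sketch places it anyway, since no other choice remains.

```python
import heapq

def spread_greedy(counts):
    """counts maps each colour to how many times it occurs (hypothetical input shape)."""
    # Negate the counters so the largest counter sits on top of the min-heap.
    heap = [(-c, colour) for colour, c in counts.items() if c > 0]
    heapq.heapify(heap)
    result = []
    while heap:
        neg, colour = heapq.heappop(heap)
        if result and result[-1] == colour and heap:
            # The top colour equals the last placed one: use the second-largest
            # counter instead and put the top pair back.
            second = heapq.heappop(heap)
            heapq.heappush(heap, (neg, colour))
            neg, colour = second
        result.append(colour)
        if neg + 1 != 0:
            # Counter decreased by one (stored negated), still non-zero.
            heapq.heappush(heap, (neg + 1, colour))
    return result

print(spread_greedy({"Red": 3, "Blue": 2}))
# ['Red', 'Blue', 'Red', 'Blue', 'Red']
```

Note that this only avoids placing the same colour twice in a row; it does not try to maximise the distance between equal colours.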
You could do an inverse of K-means clustering, aiming to either:
- maximise the number of clusters
- define the proximity of the items to similar items using some sort of inverse function, so that clusters are created from similar items that are further apart rather than close together.
I think you'd need to optimize for some kind of an improvement function - say calculate how much "better" it will be to insert Blue at a certain position and do that for all possible insert positions and then insert to any location with the maximum value of this "gain" function and continue.
Sort the list using a dynamic score function, that for each element in the list returns the distance from the closest element with the same value.
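As a rough illustration of such a score, the Python sketch below computes, for each position, the distance to the closest other element with the same value. The name `nearest_same_distances` and the use of `None` for values that occur only once are assumptions; how the score is then used for sorting or for choosing insert positions is left open, as in the suggestions above.

```python
def nearest_same_distances(items):
    """Return, per position, the distance to the nearest equal element (None if unique)."""
    last_seen = {}                 # value -> index of its most recent occurrence
    dist = [None] * len(items)
    for i, item in enumerate(items):
        if item in last_seen:
            j = last_seen[item]
            gap = i - j
            # The gap is the distance to the previous occurrence for position i ...
            dist[i] = gap
            # ... and a candidate nearest distance for the previous occurrence j.
            dist[j] = gap if dist[j] is None else min(dist[j], gap)
        last_seen[item] = i
    return dist

print(nearest_same_distances(["R", "B", "R", "R"]))
# [2, None, 1, 1]
```

The smallest value in the returned list could serve as a simple quality measure when comparing two candidate orderings.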