I have a population of 50 ordered integers (1,2,3,..,50) and I look for a generic way to slice it "n" ways ("n" is the number of cutoff points ranging from 1 to 25) that maintains the order of the elements.
For example, for n=1 (one cutoff point) there are 49 possible grouping alternatives ([1,2-49], [1-2,3-50], [1-3,4-50],...). For n=2 (two cutoff points), the grouping alternatives are like: [1,2,3-50], [1,2-3,4-50],...
Could you recommend any general-purpose algorithm to complete this task in an efficient way?
Thanks, Chris
Thanks everyone for your feedback. I reviewed all your comments and I am working on a generic solution that will return all combinations (e.g., [1,2,3-50], [1,2-3,4-50],...) for all n开发者_StackOverflow社区umbers of cutoff points.
Thanks again, Chris
Let sequence length be N
, and number of slices n
.
That problem becomes easier when you notice that, choosing a slicing to n slices is equivalent to choosing n - 1
from N - 1
possible split points (a split point is between every two numbers in the sequence). Hence there is (N - 1 choose n - 1) such slicings.
To generate all slicings (to n slices), you have to generate all n - 1
element subsets of numbers from 1 to N - 1
.
The exact algorithm for this problem is placed here: How to iteratively generate k elements subsets from a set of size n in java?
Do you need the cutoffs, or are you just counting them. If you're just going to count them, then it's simple:
1 cutoff = (n-1) options
2 cutoffs = (n-1)*(n-2)/2 options
3 cutoffs = (n-1)(n-2)(n-3)/4 options
you can see the patterns here
If you actually need the cutoffs, then you have to actually do the loops, but since n is so small, Emilio is right, just brute force it.
1 cutoff
for(i=1,i<n;++i)
cout << i;
2 cutoffs
for(i=1;<i<n;++i)
for(j=i+1,j<n;++j)
cout << i << " " << j;
3 cutoffs
for(i=1;<i<n;++i)
for(j=i+1,j<n;++j)
for(k=j+1,k<n;++k)
cout << i << " " << j << " " << k;
again, you can see the pattern
So you want to select 25 split point from 49 choices in all possible ways. There are a lot of well known algorithms to do that.
I want to draw your attention to another side of this problem. There are 49!/(25!*(49-25)!) = 63 205 303 218 876 >= 2^45 ~= 10^13 different combinations. So if you want to store it, the required amount of memory is 32TB * sizeof(Combination). I guess that it will pass 1 PB mark.
Now lets assume that you want to process generated data on the fly. Lets make rather optimistic assumption that you can process 1 million combinations per second (here i assume that there is no parallelization). So this task will take 10^7 seconds = 2777 hours = 115 days.
This problem is more complicated than it seems at first glance. If you want to solve if at home in reasonable time, my suggestion is to change the strategy or wait for the advance of quantum computers.
This will generate an array of all the ranges, but I warn you, it'll take tons of memory, due to the large numbers of results (50 elements with 3 splits is 49*48*47=110544) I haven't even tried to compile it, so there's probably errors, but this is the general algorithm I'd use.
typedef std::vector<int>::iterator iterator_t;
typedef std::pair<iterator_t, iterator_t> range_t;
typedef std::vector<range_t> answer_t;
answer_t F(std::vector<int> integers, int slices) {
answer_t prev; //things to slice more
answer_t results; //thin
//initialize results for 0 slices
results.push_back(answer(range(integers.begin(), integers.end()), 1));
//while there's still more slicing to do
while(slices--) {
//move "results" to the "things to slice" pile
prev.clear();
prev.swap(results);
//for each thing to slice
for(int group=0; group<prev.size(); ++group) {
//for each range
for(int crange=0; crange<prev[group].size(); ++crange) {
//for each place in that range
for(int newsplit=0; newsplit<prev[group][crange].size(); ++newsplit) {
//copy the "result"
answer_t cur = prev[group];
//slice it
range_t L = range(cur[crange].first, cur[crange].first+newsplit);
range_t R = range(cur[crange].first+newsplit), cur[crange].second);
answer_t::iterator loc = cur.erase(cur.begin()+crange);
cur.insert(loc, R);
cur.insert(loc, L);
//add it to the results
results.push_back(cur);
}
}
}
}
return results;
}
精彩评论