I need to create a function (in R) which:
- given N possible variables to attribute weights to; - creates all possible permuati开发者_Python百科ons of weights (summing to 100%); - subject to the constraint that weights must occur in multiples of P (usually 1%)Obviously, as N and P are inversely related - i.e. I can't specify N=7, and P=0.4. However, I'd like to be able to specify integer solutions only, i.e. P=0.01.
Sorry if this is a well-known problem - I'm not a math person, and I've searched using terms I know, but didn't find anything close enough.
I'd post the code I've written, but.. it's not impressive or insightful.
Thanks for any help!
Assuming the order of the weights matters, these are compositions; if they don't then these are partitions. In either case, they are restricted by the number of parts, what you have called N, though the code which follows uses numparts
. There is also the question of whether weights of 0 are allowed.
Since you want weights to add up to 1, you need 1/p to be an integer, which in the following code is sumparts
; it does not depend on the number of weights. Once you have the compositions, you can multiply them by p, i.e. divide by n
, to get your weights.
R has a partitions
package to generate such compositions or restricted partitions. The following code should be self explanatory: each column in the matrix is a set of weights. I have taken seven weights and p=0.1 or 10%, and prohibited weights of 0: this gives 84 possibilities; allowing weights of 0 would mean 8008 possibilities. With p=0.01 or 1% there would be 1,120,529,256 possibilities without weights of 0, and 1,705,904,746 with. If order does not matter, use restrictedparts
instead of compositions
.
> library(partitions)
> numparts <- 7 # number of weights
> sumparts <- 10 # reciprocal of p
> weights <- compositions(n=sumparts, m=numparts, include.zero=FALSE)/sumparts
> weights
[1,] 0.4 0.3 0.2 0.1 0.3 0.2 0.1 0.2 0.1 0.1 0.3 0.2 0.1 0.2 0.1 0.1 0.2 0.1 0.1
[2,] 0.1 0.2 0.3 0.4 0.1 0.2 0.3 0.1 0.2 0.1 0.1 0.2 0.3 0.1 0.2 0.1 0.1 0.2 0.1
[3,] 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.1 0.1 0.1 0.2 0.2 0.3 0.1 0.1 0.2
[4,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.2 0.2 0.3 0.3 0.3
[5,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
[6,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
[7,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
[1,] 0.1 0.3 0.2 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.3 0.2 0.1
[2,] 0.1 0.1 0.2 0.3 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.3
[3,] 0.1 0.1 0.1 0.1 0.2 0.2 0.3 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1
[4,] 0.4 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.3 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1
[5,] 0.1 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.3 0.3 0.3 0.3 0.4 0.1 0.1 0.1
[6,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2
[7,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
[1,] 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.3
[2,] 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1
[3,] 0.2 0.2 0.3 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1
[4,] 0.1 0.1 0.1 0.2 0.2 0.2 0.3 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1
[5,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.3 0.1 0.1 0.1 0.1 0.2 0.1 0.1
[6,] 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.3 0.3 0.3 0.3 0.3 0.4 0.1
[7,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2
[1,] 0.2 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1
[2,] 0.2 0.3 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1
[3,] 0.1 0.1 0.2 0.2 0.3 0.1 0.1 0.2 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1 0.1
[4,] 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.3 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.2 0.1
[5,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.3 0.1 0.1 0.1 0.1 0.2
[6,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.2
[7,] 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2
[1,] 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1
[2,] 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1
[3,] 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1
[4,] 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1
[5,] 0.1 0.1 0.1 0.1 0.1 0.2 0.1 0.1
[6,] 0.3 0.1 0.1 0.1 0.1 0.1 0.2 0.1
[7,] 0.2 0.3 0.3 0.3 0.3 0.3 0.3 0.4
EDIT : function is updated, as it gave some results twice.
You can try this function, based on recursive calculation. It will give you all possible combinations, regardless of the order. I've done it this way, as you otherwise get a multiple of the rows with all possible permutations.
The calculation is based on integers. The minimum weight P is set as 1, and Pint becomes the number of weight units that can be divided. max.W will be the maximum amount of units that can be given to one variable.
The algorithm goes as follows :
if N=2, then make all possible combinations for the given minimum and maximum weight.
if N > 2, apply this algorithm for N = 1 to ceiling(max.weight / N), with the maximum weight specified as the current maximum weight +1 minus N, and the minimum weight as N.
This gives you all possible combinations of integers. Multiplication with P gives the original weights.
Or in function :
myfunc <- function(N,P){
if(100%%(P*100) !=0){
stop("100% cannot be divided in portions of P")
}
Pint <- 100/(P*100)
max.W <- Pint- N + 1
combs <- function(n,max.w,min){
mw <- max.w + 1
if(n==2){
w <- seq.int(min,floor((mw)/2))
out <- cbind(w,mw-w)
} else if (n > 2){
x <- lapply(1:ceiling(max.w/n),function(i){
newcombs <- combs(n-1,mw-i,i)
cbind(newcombs,rep(i,nrow(newcombs)))
})
out <- do.call("rbind",x)
}
colnames(out) <-rownames(out) <- NULL
out
}
res <- combs(N,max.W)
apply(res,1,sort)*P
}
This gives the combinations in columns of a matrix :
> Y <- myfunc(3,0.1)
> Y
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.3
[2,] 0.1 0.2 0.3 0.4 0.2 0.3 0.4 0.3
[3,] 0.8 0.7 0.6 0.5 0.6 0.5 0.4 0.4
Be warned! with the testcase you gave (7 variables, jumps of 0.01), you'll be calculating a very long time for the huge amounts of possibilities. With N=7 and P=0.04, you have already 3555 possible combinations. With N=0.2, that becomes 336,443 possibilities. And you have to take into account every possible permutation of these combinations if that is what you're after.
精彩评论