I have a complicated combined model for which I can define a likelihood in a function, and I need to optimize the parameters. The problem is that the parameters go off in all directions if they are not restricted. Hence, I need to implement a restriction on the parameters, and the one proposed by my professor is that the sum of the squared parameter values should equal 1.
I've been playing around with both the optim() and nlm() functions, but I can't really get what I want. My first idea was to use n-1 parameters and calculate the last one from the rest, but this doesn't work (as expected).
To illustrate, here are some toy data and a function reflecting the core problem of what I want to achieve:
dd <- data.frame(
  X1 = rnorm(100),
  X2 = rnorm(100),
  X3 = rnorm(100)
)
dd <- within(dd, Y <- 2 + 0.57*X1 - 0.57*X2 + 0.57*X3 + rnorm(100, 0, 0.2))

myfunc2 <- function(alpha, dd){
  # reconstruct the last parameter from the other n-1 so that sum(alpha^2) = 1
  alpha <- c(alpha, sqrt(1 - sum(alpha^2)))
  X <- as.matrix(dd[,-4]) %*% alpha
  m.mat <- model.matrix(~X)
  mod <- glm.fit(m.mat, dd$Y)
  Sq <- sum(resid(mod)^2)
  return(Sq)
}

b <- c(1, 0)
optim(b, myfunc2, dd = dd)
This obviously results in:
Error: (subscript) logical subscript too long
In addition: Warning message:
In sqrt(1 - sum(alpha^2)) : NaNs produced
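(The NaN appears whenever a trial point has a squared sum greater than 1, which the optimizer's steps readily produce; a minimal illustration:)
sqrt(1 - sum(c(1, 0.5)^2))   # NaN, since 1 - 1.25 < 0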
Does anybody have an idea of how to implement restrictions on parameters in optimization routines?
PS: I am aware that this example code doesn't make sense at all; it's just for demonstration purposes.
Edit: Solved it! See Marek's answer.
I think that Ramnath's answer isn't bad, but it contains a mistake: the alpha correction should be modified.
This is the improved version:
myfunc2 <- function(alpha, dd){
  alpha <- alpha/sqrt(sum(alpha^2)) # here the modification ;)
  X <- as.matrix(dd[,-4]) %*% alpha
  m.mat <- model.matrix(~X)
  mod <- glm.fit(m.mat, dd$Y)
  Sq <- sum(resid(mod)^2)
  return(Sq)
}

b <- c(1, 1, 1)
( x <- optim(b, myfunc2, dd = dd)$par )
( final_par <- x/sqrt(sum(x^2)) )
I got similar results to your unrestricted version.
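(A quick sanity check, not part of the original answer: rescaling the unrestricted least-squares coefficients onto the unit sphere should give roughly the same vector.)
cf <- coef(lm(Y ~ X1 + X2 + X3, data = dd))[-1]  # drop the intercept
cf / sqrt(sum(cf^2))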
[EDIT]
Actually, this won't work correctly if the starting point is wrong. E.g.
x <- optim(-c(1,1,1), myfunc2, dd=dd)$par
( final_par <- x/sqrt(sum(x^2)) )
# [1] -0.5925 0.5620 -0.5771
It gives the negative of the true estimate because mod <- glm.fit(m.mat, dd$Y) estimates a negative coefficient for X.
I think this glm re-estimation isn't quite correct. You should instead estimate the intercept as the mean of the residuals Y - X*alpha (for fixed alpha, this mean is exactly the least-squares intercept).
Something like:
f_err_1 <- function(alpha, dd) {
  alpha <- alpha/sqrt(sum(alpha^2))
  X <- as.matrix(dd[,-4]) %*% alpha
  a0 <- mean(dd$Y - X)          # intercept for fixed alpha
  Sq <- sum((dd$Y - a0 - X)^2)
  return(Sq)
}
x <- optim(c(1, 1, 1), f_err_1, dd = dd)$par
( final_par <- x/sqrt(sum(x^2)) )
# [1] 0.5924 -0.5620 0.5772
x <- optim(-c(1, 1, 1), f_err_1, dd = dd)$par
( final_par <- x/sqrt(sum(x^2)) )
# [1] 0.5924 -0.5621 0.5772
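(For reference, these line up, up to simulation noise, with the true coefficients used to generate dd, rescaled to unit length; a_true is just an illustrative name.)
a_true <- c(0.57, -0.57, 0.57)
a_true / sqrt(sum(a_true^2))
# [1] 0.5774 -0.5774 0.5774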
You need to provide more details about your constraint. If you are dealing with the sum of squares being equal to one, an elegant way to solve it using optim is to leave the parameters entering optim unconstrained, and to reparametrize them within your optimization function.
To illustrate my point, in the example you stated above, you can get the optimization running by making the following changes to your code:
myfunc2 <- function(alpha, dd){
  alpha <- alpha^2/sum(alpha^2)
  X <- as.matrix(dd[,-4]) %*% alpha
  m.mat <- model.matrix(~X)
  mod <- glm.fit(m.mat, dd$Y)
  Sq <- sum(resid(mod)^2)
  return(Sq)
}

b <- c(1, 1, 1)
fit <- optim(b, myfunc2, dd = dd)
ans <- fit$par^2/sum(fit$par^2)   # apply the same transformation to the result
This would work for more than 3 variables as well; a sketch of that case follows below. Let me know if this makes sense and if you have additional questions.
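(A minimal sketch of the same reparametrization idea with five predictors, on hypothetical simulated data; note that I use the unit-length normalization from the accepted answer above, since that is the one which actually enforces sum(alpha^2) = 1.)
set.seed(1)
dd5 <- as.data.frame(matrix(rnorm(500), ncol = 5))   # 100 obs, 5 predictors
a <- c(1, -1, 1, -1, 1)/sqrt(5)                      # true unit-length alpha
dd5$Y <- 2 + as.matrix(dd5) %*% a + rnorm(100, 0, 0.2)

f5 <- function(alpha, dd){
  alpha <- alpha/sqrt(sum(alpha^2))
  X <- as.matrix(dd[, 1:5]) %*% alpha
  a0 <- mean(dd$Y - X)
  sum((dd$Y - a0 - X)^2)
}
x <- optim(rep(1, 5), f5, dd = dd5)$par
x/sqrt(sum(x^2))   # should be close to a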
It may be a bit trickier than you want, and I don't have the time to work out the details at the moment, but I think you can still do this. Suppose you bound all parameters between 0 and 1 (you can do this with L-BFGS-B) and map the optim() parameters p to your real parameters p' as follows:
p_1' = p_1
p_2' = sqrt(p_2 * (1 - p_1'^2))
p_3' = sqrt(p_3 * (1 - (p_1'^2 + p_2'^2)))
...
p_n' = sqrt(1 - sum(p_i'^2, i < n))
or something a bit like that.
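(A rough sketch of that mapping as I read it, ignoring signs, so it can only recover non-negative coefficients; handling signs would need extra bookkeeping. map_to_sphere and f_mapped are names made up for illustration.)
map_to_sphere <- function(p) {
  # each p_i in (0,1) gives the fraction of the remaining squared mass;
  # the last coordinate absorbs whatever is left, so sum(pp^2) == 1
  pp <- numeric(length(p) + 1)
  pp[1] <- p[1]                        # p_1' = p_1
  remaining <- 1 - p[1]^2
  if (length(p) > 1) {
    for (i in 2:length(p)) {
      pp[i] <- sqrt(p[i] * remaining)  # p_i' takes a fraction of what remains
      remaining <- remaining * (1 - p[i])
    }
  }
  pp[length(pp)] <- sqrt(remaining)    # last coordinate takes what is left
  pp
}

f_mapped <- function(p, dd){
  alpha <- map_to_sphere(p)
  X <- as.matrix(dd[,-4]) %*% alpha
  a0 <- mean(dd$Y - X)
  sum((dd$Y - a0 - X)^2)
}

optim(c(0.3, 0.3), f_mapped, dd = dd, method = "L-BFGS-B",
      lower = 1e-8, upper = 1 - 1e-8)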