I'm working on an estimation problem using the EM algorithm. The problem is as follows:
You have 3 coins with probabilities of being heads P1, P2, and P3 respectively. You flip coin 1. If coin 1=H, then you flip coin 2; if coin 1=T, then you flip coin 3. You only record whether coin 2 or 3 is heads or tails, not which coin was flipped. So the observations are strings of heads and tails, but nothing else. The problem is to estimate P1, P2, and P3.
My R code to do this is below. It's not working, and I can't figure out why. Any thoughts would be appreciated as I think this is a pretty crafty problem.
Ben
###############
#simulate data
p1<-.8
p2<-.8
p3<-.3
tosses<-1000
rbinom(tosses,size=1,prob=p1)->coin.1
pv<-rep(p3,tosses)
pv[coin.1==1]<-p2
#face now contains the probabilities of a head
rbinom(tosses,size=1,prob=pv)->face
rm(list=(ls()[ls()!="face"]))
#face is all you get to see!
################
#e-step
e.step<-function(x,theta.old) {
fun<-function(p,theta.old,x) {
theta.old[1]->p1
theta.old[2]->p2
theta.old[3]->p3
log(p1*p2^x*(1-p2)^(1-x))*(x*p1*p2+(1-x)*p1*(1开发者_C百科-p2))->tmp1 #this is the first part of the expectation
log((1-p1)*p3^x*(1-p3)^(1-x))*(x*(1-p1)*p3+(1-x)*(1-p1)*(1-p3))->tmp2 #this is the second
mean(tmp1+tmp2)
}
return(fun)
}
#m-step
m.step<-function(fun,theta.old,face) {
nlminb(start=runif(3),objective=fun,theta.old=theta.old,x=face,lower=rep(.01,3),upper=rep(.99,3))$par
}
#initial estimates
length(face)->N
iter<-200
theta<-matrix(NA,iter,3)
c(.5,.5,.5)->theta[1,]
for (i in 2:iter) {
e.step(face,theta[i-1,])->tmp
m.step(tmp,theta[i-1,],face)->theta[i,]
print(c(i,theta[i,]))
if (max(abs(theta[i,]-theta[i-1,]))<.005) break("conv")
}
#note that this thing isn't going anywhere!
You can't estimate P1, P2 and P3 separately. The only useful information is the proportion of recorded heads and the total number of sets of flips (each set of flips is independent, so the order does not matter). This is like trying to solve one equation for three unknowns, and it cannot be done.
The probability of recording a head is P1*P2 + (1-P1)*P3
which in your example is 0.7
and of a tail is one minus that, i.e. P1*(1-P2) + (1-P1)*(1-P3)
in your example 0.3
Here is a simple simulator
#simulate data
sim <- function(tosses, p1, p2, p3) {
coin.1 <- rbinom(tosses, size=1, prob=p1)
coin.2 <- rbinom(tosses, size=1, prob=p2)
coin.3 <- rbinom(tosses, size=1, prob=p3)
ifelse(coin.1 == 1, coin.2, coin.3) # returned
}
The following are illustrations all producing 0.7 (with some random fluctuations)
> mean(sim(100000, 0.8, 0.8, 0.3))
[1] 0.70172
> mean(sim(100000, 0.2, 0.3, 0.8))
[1] 0.69864
> mean(sim(100000, 0.5, 1.0, 0.4))
[1] 0.69795
> mean(sim(100000, 0.3, 0.7, 0.7))
[1] 0.69892
> mean(sim(100000, 0.5, 0.5, 0.9))
[1] 0.70054
> mean(sim(100000, 0.6, 0.9, 0.4))
[1] 0.70201
Nothing you can do subsequently will distinguish these.
精彩评论