I am trying to write a simple program to combine coefficient and standard error estimates from a set of regression fits. I run, say, 5 regressions, and store the coefficient(s) and standard error(s) of interest into vectors (Stata matrix objects, actually). Then, I need to do the following:
- Find the mean value of the coefficient estimates.
- Combine the standard error estimates according to the formula suggested for combining results from "multiple imputation". The formula is the square root of the formula for "T" on page 6 of this document.
I have written Stata code that does this once, but I want to write this as a function (or "program", in Stata speak) that takes as arguments the vector (or matrix, if possible, to combine multiple estimates at once) of regression coefficient estimates and the vector (or matrix) of corresponding standard error estimates, and then generates 1 a开发者_运维技巧nd 2 above. Here is the code that I wrote:
(breg is a 1x5 vector of the regression coefficient estimates, and sereg is a 1x5 vector of the associated standard error estimates)
mat ones = (1,1,1,1,1)
mat bregmean = (1/5)*(ones*breg’)
scalar bregmean_s = bregmean[1,1]
mat seregmean = (1/5)*(ones*sereg’)
mat seregbtv = (1/4)*(breg - bregmean#ones)* (breg - bregmean#ones)’
mat varregmi = (1/5)*(sereg*sereg’) + (1+(1/5))* seregbtv
scalar varregmi_s = varregmi[1,1]
scalar seregmi = sqrt(varregmi_s)
disp bregmean_s
disp seregmi
This gives the right answer for a single instance. Any pointers would be great!
UPDATE: I completed the code for combining estimates in a kXm matrix of coefficients/parameters (k is the number of parameters, m the number of imputations). Code can be found here.
Thanks to Tristan and Gabi for the pointers.
UPDATE: I completed the code for combining a kXm matrix of coefficients/parameters, where k is the number of coefficients/parameters, and m is the number of imputations. You can find it here.
Thanks to Tristan and Gabi for the hints.
Tristan is right that you can use "args" to call matrices as parameters in a Stata program. Using his template, this hack should replicate the calculation in your original post for any breg, sreg pair of vectors:
capture prog drop myMI
program myMI
args breg sereg
local params=colsof(`breg') // store the number of parameters here
mat ones=J(1,`params',1)
mat bregmean = (1/`params')*(ones*breg')
scalar bregmean_s = bregmean[1,1]
mat seregmean = (1/`params')*(ones*sereg')
mat seregbtv = (1/(`params'-1))*(breg - bregmean#ones)* (breg - bregmean#ones)'
mat varregmi = (1/`params')*(sereg*sereg') + (1+(1/`params'))* seregbtv
scalar varregmi_s = varregmi[1,1]
scalar seregmi = sqrt(varregmi_s)
disp bregmean_s
disp seregmi
end
You call it with
myMI breg sereg
I assume you are aware that multiple imputation, including these types of combination rules, is built into Stata 11. I haven't used it, but it's probably the way to go if possible.
You can easily write a Stata program using the args
command to get the matrices. Follow this template:
capture program drop mi_combine
program define mi_combine
args coef se
matlist `coef'
matlist `se'
end
mat ones = (1,1,1)
mat twos = (2, 2, 2)
mi_combine ones twos
That should help. I am a real novice with Stata. I have v10 and if/when I get v11 I will look into mi. I wanted to use Clarify's mi combine features but I am using models that Clarify doesn't support; I am not so sure mi will either (e.g., a user-supplied censored quantile regression .ado)
Since i posted the q I came up with something that does the job for now:
capture program drop micombine
program define micombine
mat ones = (1,1,1,1,1)
mat bmean = (1/5)*(ones*`1'')
mat wtv = (1/5)*(`2'*`2'')
scalar wtv_s = wtv[1,1]
mat btv = (1/4)*(`1' - bmean#ones)* (`1' - bmean#ones)'
scalar btv_s = btv[1,1]
mat varregmi = wtv_s + (1+(1/5))*btv
scalar varregmi_s = varregmi[1,1]
scalar seregmi = sqrt(varregmi_s)
scalar bmean_s = bmean[1,1]
scalar dfmi = (5-1)*(1+(5*wtv_s)/(6*btv_s))^2
scalar moemi = seregmi*invttail(dfmi,.025)
di as text "b_mi = " as result bmean_s
di as text "se_mi = " as result seregmi
di as text "df_mi =" as result dfmi
di as text "95% moe_mi =" as result moemi
end
Ugly, and only does it for one coefficient estimate. Would love to be able to do it for all the coefs, though that isn't necessary.
If the calculations are complicated, I recommend mata.
If you want to do the calculations in the data, you could use parmest
and/or postfile
to export the regression coefficients into a temporary data file.
精彩评论