I'm sure this is an issue anyone who uses Stata for publications or reports has run into:
How do you conveniently export your output to something that can be parsed by a scripting language or Excel?
There are a few ado
files that do this for specific commands. For example:
findit tabout
findit outreg2
But wha开发者_如何学Got about exporting the output of the table
command? Or the results of an anova
?
I would love to hear about how Stata users address this problem for either specific commands or in general.
After experimenting with this for a while, I've found a solution that works for me.
There are a variety of ADOs that handle exporting specific functions. I've made use of outreg2
for regressions and tabout
for summary statistics.
For more simple commands, it's easy to write your own programs to save results automatically to plaintext in a standard format. Here are a few I wrote...note that these both display results (to be saved to a log file) and export them into text files – if you wanted to just save to text you could get rid of the di
's and qui
the sum
, tab
, etc. commands:
cap program drop sumout
program define sumout
di ""
di ""
di "Summary of `1'"
di ""
sum `1', d
qui matrix X = (r(mean), r(sd), r(p50), r(min), r(max))
qui matrix colnames X = mean sd median min max
qui mat2txt, matrix(X) saving("`2'") replace
end
cap program drop tab2_chi_out
program define tab2_chi_out
di ""
di ""
di "Tabulation of `1' and `2'"
di ""
tab `1' `2', chi2
qui matrix X = (r(p), r(chi2))
qui matrix colnames X = chi2p chi2
qui mat2txt, matrix(X) saving("`3'") replace
end
cap program drop oneway_out
program define oneway_out
di ""
di ""
di "Oneway anova with dv = `1' and iv = `2'"
di ""
oneway `1' `2'
qui matrix X = (r(F), r(df_r), r(df_m), Ftail(r(df_m), r(df_r), r(F)))
qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P
qui mat2txt, matrix(X) saving("`3'") replace
end
cap program drop anova_out
program define anova_out
di ""
di ""
di "Anova command: anova `1'"
di ""
anova `1'
qui matrix X = (e(F), e(df_r), e(df_m), Ftail(e(df_m), e(df_r), e(F)), e(r2_a))
qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P RsquaredAdj
qui mat2txt, matrix(X) saving("`2'") replace
end
The question is then how to get the output into Excel and format it. I found that the best way to import the text output files from Stata into Excel is to concatenate them into one big text file and then import that single file using the Import Text File...
feature in Excel.
I concatenate the files by placing this Ruby code in the output folder and then running int from my Do file with qui shell cd path/to/output/folder/ && ruby table.rb
:
output = ""
Dir.new(".").entries.each do |file|
next if file =~/\A\./ || file == "table.rb" || file == "out.txt"
if file =~ /.*xml/
system "rm #{file}"
next
end
contents = File.open(file, "rb").read
output << "\n\n#{file}\n\n" << contents
end
File.open("out.txt", 'w') {|f| f.write(output)}
Once I import out.txt
into its own sheet in Excel, I use a bunch of Excel's built-in functions to pull the data together into nice, pretty tables.
I use a combination of vlookup
, offset
, match
, iferror
, and hidden columns with cell numbers and filenames to do this. The source .txt file is included in out.txt
just above the contents of that file, which lets you look up the contents of the file using these functions and then reference specific cells using vlookup
and offset
.
This Excel business is actually the most complicated part of this system and there's really no good way to explain it without showing you the file, though hopefully you can get enough of an idea to figure it out for yourself. If not, feel free to contact me through http://maxmasnick.com and I can get you more info.
I have found that the estout package is the most developed and has good documentation.
This is an old question and a lot has happened since it was posted.
Stata now has several built-in commands and functions that allow anyone to export customized output fairly easily:
putexcel
putexcel with advanced syntax
putdocx
putpdf
There are also equivalent Mata
functions / classes, which offer greater flexibility:
_docx*()
Pdf*()
xl()
From my experience, there aren't 100% general solutions. Community-contributed commands such as estout
are now mature enough to handle most basic operations. That said, if you have something that deviates even slightly from the template you will have to program this yourself.
Most tutorials throw in several packages where it would indeed nice to have only one exporting everything, which is what Max suggests above with his interesting method.
I personally use tabout
for summary statistics and frequencies, estout
for regression output, and am trying out mkcorr
for correlation matrixes.
It's been a while, but I believe you can issue a log command to capture the output.
log using c:\data\anova_analysis.log, text
[commands]
log close
I use estpost-- part of the estout package-- to tabulate results from non-estimation commands. You can then store them and export easily.
Here's an example:
estpost corr varA varB varC varD, matrix
est store corrs
esttab corrs using corrs.rtf, replace
You can then add options to change formatting, etc.
You can use asdoc that is available on SSC. To download,
ssc install asdoc
asdoc works well with almost all Stata commands. Specifically, it produces publication quality tables for :
- summarize command - to report summary statistics
- cor or pwcorr command - to report correlations
- tabstat - for flexible tables of descriptive statistics
- tabulate - for one-way, two-way, three-way tabulations
- regress - for detailed, nested, and wide regression tables
- table - flexible tables and many more. You can explore more about asdoc here https://fintechprofessor.com/2018/01/31/asdoc/
精彩评论