I've picked up the ggplot2 book but I'm struggling to understand how data persists through layers.
For example, let's take a dataset and calculate the mean of each X:
thePlot = ggplot( myDF , aes_string( x = "IndepentVar" , y = "DependentVar" ) )
thePlot = thePlot + stat_summary( fun.y = mean , geom = "point" )
How do I "access" the summary statistics in the next layer? For example, let's say I want to plot a smooth line over the dataset. This seems to work:
thePlot = thePlot + stat_smooth( aes( group = 1 ) , method = "lm" , geom = "smooth" , se = FALSE )
But let's say I want to ignore a particular X value when generating the line. How do I reference the summarized dataset to exclude that X?
More generally, how is data referenced as it flows through layers? Am I always limited to the last statistics? Can I reference the original dataset?
Here is an attempt at answering your question:
- The aesthetics defined in the ggplot call are used as defaults in all subsequent layers unless a layer explicitly overrides them. That is the reason geom_smooth works.
- You can specify the data frame and aesthetics for each layer separately. For example, if you want to exclude some values of x while plotting geom_smooth, you can specify subset = .(x != xvalues) inside the geom_smooth call.
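
As a rough illustration of the second point, here is a minimal sketch that gives the smoothing layer its own filtered data frame through the layer's data argument. It rests on a few assumptions: the contents of myDF and the excludedX value are made up for the example, fun = mean assumes ggplot2 3.3.0 or later (older versions use fun.y), and I believe the subset = .() form was dropped from layers in ggplot2 2.0.0, so passing a subsetted data frame is the form more likely to run on a current install.

```r
library(ggplot2)

# Hypothetical data standing in for myDF; column names follow the question.
set.seed(1)
myDF <- data.frame(
  IndepentVar  = rep(1:5, each = 10),
  DependentVar = rep(1:5, each = 10) * 2 + rnorm(50)
)

# Aesthetics set in ggplot() are inherited by every layer that does not
# override them.
thePlot <- ggplot(myDF, aes(x = IndepentVar, y = DependentVar)) +
  stat_summary(fun = mean, geom = "point")

# Each layer computes its statistic from its own data (by default the
# plot's data), not from the output of the previous layer. To leave one
# x value out of the fit, give this layer a filtered copy of the data.
excludedX <- 3  # hypothetical value to ignore
thePlot <- thePlot +
  stat_smooth(
    data = subset(myDF, IndepentVar != excludedX),
    method = "lm", formula = y ~ x, se = FALSE
  )

print(thePlot)
```

The points layer still summarizes every row of myDF, while the fitted line only sees the rows where IndepentVar is not excludedX. The original data frame is always available to any layer this way.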
I can provide more detailed examples, if you have specific questions.
Hope this helps