Smoothing of auto-correlated time series data with ggplot2

开发者 (Developer) https://www.devze.com 2023-03-18 08:58 Source: Internet

Is there a way to incorporate a smoothing function for an auto-correlated time series in ggplot2?

I have auto-correlated time series data, for which I currently use a manual process to determine the 95% CI for the fitted spline.

Usage and Date are in a data frame AB. The main components of the model I use are as follows:

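    library(splines)          # provides bs()

    d <- AB$Date
    a <- AB$Usage

    # sort both vectors by date
    o <- order(d)
    d <- d[o]
    a <- a[o]

    id <- ts(seq_along(d))    # time index
    a1 <- ts(a)               # usage series

    a2 <- lag(a1, -1)         # previous observation (lag 1)
    tg <- ts.union(a1, id, a2)
    # df1: spline degrees of freedom, set beforehand
    mg <- lm(a1 ~ a2 + bs(id, df = df1), data = tg)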

From this model I obtain fitted means and standard errors of the fit, which are used to work out the 95% CI for the fitted spline.

I have seen examples of the lm method in ggplot2 with a term to specify the model formula. Is this kind of time series model achievable when the time series is auto-correlated?

Thanks.

If there is dependence in the residuals, the CI will be biased when you use the simple formula interface in ggplot2 to add a model fit, because that fit treats the residuals as independent.

If I were doing this, I would fit whatever model I wanted outside of ggplot2. Then predict from that model over a grid of evenly spaced points in the range of the covariates. Compute confidence intervals for those predictions, and combine these, the fitted values, and the data into a single data frame. From there you can use geom_line() and geom_ribbon() for the fitted model and confidence interval respectively. This allows you to compute proper confidence intervals that account for the lack of independence in the residuals.
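
A rough sketch of that workflow follows. It is only one possible instance: it assumes an AR(1) error structure and uses `mgcv::gamm()` as the model fitted outside ggplot2, neither of which comes from the question. The data are simulated, and the names `dat`, `grid`, `m` and `plt` are hypothetical.

```r
library(mgcv)     # gamm(); attaches nlme, which supplies corAR1()
library(ggplot2)

# Simulated data for illustration only: smooth trend plus AR(1) noise
set.seed(1)
n   <- 200
dat <- data.frame(id = 1:n)
dat$Usage <- 10 + sin(dat$id / 30) +
  as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.3))

# 1. Fit the model outside ggplot2 (assumed here: smooth of id, AR(1) errors)
m <- gamm(Usage ~ s(id), correlation = corAR1(form = ~ id), data = dat)

# 2. Predict over an evenly spaced grid, keeping the standard errors
grid <- data.frame(id = seq(min(dat$id), max(dat$id), length.out = 100))
p    <- predict(m$gam, newdata = grid, se.fit = TRUE)

# 3. Approximate pointwise 95% interval from the standard errors
grid$fit <- as.numeric(p$fit)
grid$lwr <- grid$fit - 1.96 * as.numeric(p$se.fit)
grid$upr <- grid$fit + 1.96 * as.numeric(p$se.fit)

# 4. Draw data, interval and fitted line, each layer with its own data
plt <- ggplot() +
  geom_point(data = dat, aes(x = id, y = Usage)) +
  geom_ribbon(data = grid, aes(x = id, ymin = lwr, ymax = upr), alpha = 0.3) +
  geom_line(data = grid, aes(x = id, y = fit))
```

Printing `plt` draws the figure. Giving every layer its own `data` and `aes()` avoids ggplot2 looking for `Usage` in the prediction grid, which does not contain it.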

One issue I foresee is that you have a model that includes two covariates, whereas ggplot() would normally consider the relationship between a response and a single covariate. For example, if you are plotting a1 vs id in ggplot but the model is a1 ~ a2 + bs(id), then you'd need to account for a2 in some manner first, say by predicting for a range of values of id while keeping a2 fixed at some reasonable value, such as the sample mean.
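
A minimal sketch of holding `a2` fixed, under stated assumptions: the data are simulated, a plain lag-1 column stands in for the `ts` objects from the question, and `df = 5` and the names `grid` and `plt` are arbitrary choices for illustration.

```r
library(splines)  # bs()
library(ggplot2)

# Simulated stand-in for the question's data (illustration only)
set.seed(42)
n  <- 200
AB <- data.frame(id = 1:n)
AB$a1 <- 10 + sin(AB$id / 30) +
  as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.3))
AB$a2 <- c(NA, head(AB$a1, -1))   # previous observation (lag 1)

# Same form as the question's model; the row with NA in a2 is dropped by lm()
mg <- lm(a1 ~ a2 + bs(id, df = 5), data = AB)

# Vary id over an even grid while holding a2 at its sample mean
grid <- data.frame(
  id = seq(min(AB$id), max(AB$id), length.out = 100),
  a2 = mean(AB$a2, na.rm = TRUE)
)

# predict.lm() returns a matrix with columns fit, lwr, upr
ci   <- predict(mg, newdata = grid, interval = "confidence", level = 0.95)
grid <- cbind(grid, as.data.frame(ci))

plt <- ggplot() +
  geom_point(data = AB, aes(x = id, y = a1)) +
  geom_ribbon(data = grid, aes(x = id, ymin = lwr, ymax = upr), alpha = 0.3) +
  geom_line(data = grid, aes(x = id, y = fit))
```

The resulting band is conditional on `a2` being held at its mean, so it describes the spline in `id` at that value of `a2` rather than the one-step-ahead fitted values.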