I have a tricky problem with applying a function to a list of data frames. Ultimately I want to plot individual time series charts for a large data set of drug usage figures.
My dataset comprises 30 different antibiotics with a usage rate that has been collected monthly over a 5-year period. It has 3 columns and 1692 rows.
So far I have made a list of individual data frames, one for each antibiotic class. (The name of the list is drugList, and drug.class is a character vector of drug names from the original data frame, AB.)
drugList <- list()
n <- length(drug.class)
for (i in seq_len(n)){
  # keep only the rows of AB that belong to the i-th drug
  drugList[[i]] <- AB[AB$Drug == drug.class[i], ]
}
For example, I have 30 data frames in a list with the following columns:
[[29]]
Drug Usage DateA
1353 Tobramycin 5.06 01-Jan-2006
1354 Tobramycin 4.21 01-Feb-2006
1355 Tobramycin 6.34 01-Mar-2006
.
.
.
Drug Usage DateA
678 Vancomycin 11.62 01-Jan-2006
679 Vancomycin 11.94 01-Feb-2006
680 Vancomycin 14.29 01-Mar-2006
Before each plot is made, a logical test is performed to determine whether the time series is autocorrelated. The data frames in the list are of varying lengths. I have written a function to perform the test as follows:
acTest <- function(){
id<-ts(1:length(DateA))
a1<-ts(Usage)
a2<-lag(a1,-1)
tg<-ts.union(a1,id,a2)
mg<-lm(a1~a2+bs(id,df=3), data=tg)
a2Pval <- summary(mg)$coefficients[2, 4]
if (a2Pval<=0.05) {
TRUE
} else {
FALSE
}
}
I have previously tested all my functions on individual data frames and they work as expected.
I am trying to work out how to apply the test to each data frame in the drug list. I believe if I can get help working this out I will be in a position to apply the time series functions in the same manner.
Thanks in advance for any assistance offered.
A few suggestions:
Change your acTest function so that it actually accepts a data.frame as a parameter. Otherwise you'll have lots of problems with the function looking for (and modifying) objects named DateA and Usage in the global environment.
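library(splines)  # bs() comes from the splines package

acTest <- function(dat){
  # Work on the columns of the data frame that is passed in
  id <- ts(seq_along(dat$DateA))   # time index 1..n
  a1 <- ts(dat$Usage)              # usage series
  a2 <- lag(a1, -1)                # usage lagged by one observation
  tg <- ts.union(a1, id, a2)
  mg <- lm(a1 ~ a2 + bs(id, df = 3), data = tg)
  # p-value of the lagged term (row 2 of the coefficient table)
  a2Pval <- summary(mg)$coefficients[2, 4]
  # TRUE if the lagged term is significant at the 5% level
  a2Pval <= 0.05
}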
Applying a function to each element of a list is a common task in R. It is (most often) done using lapply, which returns a list with one result per element.
lapply(drugList,FUN=acTest)
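As a rough sketch of what that call looks like end to end, here is a self-contained version on simulated data. The data frame below is made up (DrugA and DrugB are hypothetical names, and the usage values are random), and the test function is assumed to have the one-argument form suggested above. It uses vapply rather than lapply, which is one option if you would rather get back a named logical vector than a list.

```r
library(splines)  # bs() comes from the splines package

# One-argument version of the test, as suggested above
acTest <- function(dat) {
  id <- ts(seq_along(dat$Usage))
  a1 <- ts(dat$Usage)
  a2 <- stats::lag(a1, -1)          # usage lagged by one observation
  tg <- ts.union(a1, id, a2)
  mg <- lm(a1 ~ a2 + bs(id, df = 3), data = tg)
  summary(mg)$coefficients[2, 4] <= 0.05
}

# Simulated stand-in for AB: two hypothetical drugs, 60 monthly values each
set.seed(1)
n <- 60
AB <- data.frame(
  Drug  = rep(c("DrugA", "DrugB"), each = n),
  Usage = c(as.numeric(arima.sim(list(ar = 0.8), n = n)) + 10,
            rnorm(n, mean = 10)),
  DateA = rep(seq(as.Date("2006-01-01"), by = "month", length.out = n),
              times = 2)
)

# split() builds a named list with one data frame per drug
drugList <- split(AB, AB$Drug)

# vapply() checks that every call returns exactly one logical value
isAutocorrelated <- vapply(drugList, acTest, logical(1))
isAutocorrelated
```

Because the list is named, each result is labelled with its drug, which should make it easier to decide which plotting function to call for which series.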
Finally, you can do tasks like this without storing each data frame as a separate list element by using tools like ddply from the plyr package (among others) that split a data frame using one variable, apply a function to each piece and then reassemble the results into a single data frame again. In your case, that would look something like:
ddply(AB,.(Drug),.fun = acTest)
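If you would rather stay in base R, the same split-apply-combine idea can be sketched by hand. The example below is only an illustration: the toy data frame reuses the six rows shown in the question, and the per-drug function is a stand-in (mean usage) rather than the autocorrelation test, so that it runs on very short series.

```r
# Toy stand-in for AB, built from the rows shown in the question
AB <- data.frame(
  Drug  = rep(c("Tobramycin", "Vancomycin"), each = 3),
  Usage = c(5.06, 4.21, 6.34, 11.62, 11.94, 14.29),
  DateA = rep(c("01-Jan-2006", "01-Feb-2006", "01-Mar-2006"), times = 2),
  stringsAsFactors = FALSE
)

# Split: one data frame per drug, named by drug
pieces <- split(AB, AB$Drug)

# Apply: a stand-in summary (mean usage); swap in your own function here
meanUsage <- vapply(pieces, function(d) mean(d$Usage), numeric(1))

# Combine: reassemble the per-drug results into a single data frame
result <- data.frame(Drug = names(meanUsage),
                     MeanUsage = unname(meanUsage),
                     stringsAsFactors = FALSE)
result
```

The result has one row per drug, which is roughly the shape ddply would hand back for a function that returns a single value.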