R loops: Adding a column to a table if does not already exist_问答_开发者

R loops: Adding a column to a table if does not already exist

开发者 https://www.devze.com 2023-03-29 12:42 出处：网络

I am trying to compile data from several files using for loops in R开发者_C百科. I would like to get all the data into one table. Following calculation is just an example.

library(reshape)

dat1 <- data.frame("Specimen" = paste("sp", 1:10, sep=""), "Density_1" = rnorm(10,4,2), "Density_2" = rnorm(10,4,2), "Density_3" = rnorm(10,4,2))
dat2 <- data.frame("Specimen" = paste("fg", 1:10, sep=""), "Density_1" = rnorm(10,4,2), "Density_2" = rnorm(10,4,2))

dat <- c("dat1", "dat2")
for(i in 1:length(dat)){
data <- get(dat[i])
melt.data <- melt(data, id = 1)
assign(paste(dat[i], "tbl", sep=""), cast(melt.data, ~ variable, mean))
}

rbind(dat1tbl, dat2tbl)

What is the smoothest way to add an extra column into dat2? I would like to get the same column name ("Density_3" in this case) and fill it up with zeros, if it does not already exist. Assume that I have ~100 tables with number of columns (Density_1, 2, 3 etc) varying between 5 and 6.

I tried following, but it didn't work:

if(names(data) %in% "Density_3" == FALSE){
dat.all$Density_3 <- 0
} else {
dat.all$Density_3 <- dat.all$Density3}

Another one: is there a smooth way to rbind() the tables? It seems that rbind(get(dat)) does not work.

After staring at this question for a while I think its intent may have been obscured by the unnecessary get and assign manipulations. And I think the answer is pylr::rbind.fill

I would have constructed "dat", not as a character vector but as a list of two dataframes, used aggregate( ..., FUN=mean) (because I haven't gotten on the reshape2/plyr bus, except for melt and rbind.fill that is ) and then do.call(rbind.fill, ...) on the resulting list. At any rate this is what I think you want. I do not think it is a good idea to add in zeros for what are really missing values.

> rbind.fill(dat1tbl, dat2tbl)
  value Density_1 Density_2 Density_3
1 (all)  5.006709  4.088988  2.958971
2 (all)  4.178586  3.812362        NA

This is an old post, but in any case: I believe the code you mention above would have worked if you switch the order:

if("Density_3" %in% names(data) == FALSE){
dat.all$Density_3 <- 0
} else {
dat.all$Density_3 <- dat.all$Density3}

As you have it, this part "Density_3" %in% names(data) == FALSE would give you a vector of TRUE/FALSE (for each column), while what you want is only one value, for that specific column. So, you need to ask if that column is present in the data frame, and not the opposite.