Editing cell entries in a variable in a data frame inside a list of data frames

Developer https://www.devze.com 2023-03-14 08:20 Source: web

Define:

> dats <- list( df1 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))),
+       df2 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))))
> dats
$df1
  a           b
1 3 325.049072M
2 2 325.049072M
3 1 325.049072M

$df2
  a           b
1 2 325.049072M
2 1 325.049072M
3 3 325.049072M

I want to remove the M character from column b in each data frame.

For a plain character vector this is straightforward:

> t<-c("325.049072M","325.049072M")
> t
[1] "325.049072M" "325.049072M"
> t <- substr(t, 1, nchar(t)-1)
> t
[1] "325.049072" "325.049072"

But how do I do the same for a column nested inside a list of data frames? Here is one failed attempt:

> dats <- list( df1 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))),
+       df2 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))))
> dats
$df1
  a           b
1 3 325.049072M
2 1 325.049072M
3 2 325.049072M

$df2
  a           b
1 2 325.049072M
2 3 325.049072M
3 1 325.049072M

> for(i in seq(along=dats)) {
+   dats[[i]]["b"] <- 
+           substr(dats[[i]]["b"], 1, nchar(dats[[i]]["b"])-1)
+ }
> dats
$df1
  a         b
1 3 c(1, 1, 1
2 1 c(1, 1, 1
3 2 c(1, 1, 1

$df2
  a         b
1 2 c(1, 1, 1
2 3 c(1, 1, 1
3 1 c(1, 1, 1

You can do this with lapply (and some coercion):

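# Strip the trailing "M" from column b of one data frame
stripM <- function(x){
  b <- as.character(x$b)              # factor -> character
  x$b <- substr(b, 1, nchar(b) - 1)   # drop the last character
  x
}
# lapply returns a new list, so assign the result back
dats <- lapply(dats, FUN=stripM)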

If you need that variable as a factor, you can include a line in stripM that converts it back to a factor, something like x$b <- as.factor(x$b).
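
If the column should stay a factor throughout, one alternative (not part of the answer above) is to edit the factor levels directly instead of converting to character and back. The following is only a sketch: stripM_levels is a hypothetical helper name, and it assumes every label in b ends in a single "M".

```r
# Hypothetical helper (not from the original answer): strip a trailing "M"
# by rewriting the factor levels, so column b remains a factor.
stripM_levels <- function(x) {
  levels(x$b) <- sub("M$", "", levels(x$b))
  x
}

# Same shape as the data in the question, with a fixed column a
dats <- list(
  df1 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3))),
  df2 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3)))
)

# Apply the helper to every data frame in the list
dats_fixed <- lapply(dats, stripM_levels)
str(dats_fixed$df1$b)
```

Because only the levels are touched, this avoids the character round trip, but it still leaves b as a factor rather than a number.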

Try using gsub instead of substr - something like this:

lapply(<data.frame or list>, function(x) as.numeric(gsub("M$", "", x)))

Of course, you still need to work out how to recurse into the list elements, but I guess you get the picture.
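
One way to fill in that recursion, as a sketch rather than the answerer's own code: let lapply walk the list and apply gsub to column b inside each data frame. The result name dats_num is made up for illustration, and the explicit as.character() call is an assumption added to make the factor-to-character step visible.

```r
# Same shape as the data in the question, with a fixed column a
dats <- list(
  df1 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3))),
  df2 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3)))
)

# The outer lapply visits each data frame; the anonymous function
# rewrites column b and returns the modified data frame.
dats_num <- lapply(dats, function(d) {
  b_chr <- as.character(d$b)               # factor labels as strings
  d$b <- as.numeric(gsub("M$", "", b_chr)) # drop trailing "M", then convert
  d
})

str(dats_num$df1)
```

Here b ends up numeric, which is usually what you want if the values are going to be used in calculations.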

Ok, here is another possibility, not neat, but intelligible:

for(i in seq_along(dats)) {
    # [[ extracts the column itself, so as.character() sees the factor
    b <- as.character(dats[[i]][["b"]])
    b <- substr(b, 1, nchar(b)-1)    # drop the trailing "M"
    dats[[i]][["b"]] <- b
}
dats

I have to say that I find the whole [[ versus [ referencing very cryptic.

> str(dats[[i]][["b"]])
 chr [1:3] "325.049072" "325.049072" "325.049072"
> str(dats[[i]]["b"])
'data.frame':   3 obs. of  1 variable:
 $ b: chr  "325.049072" "325.049072" "325.049072"

I proceed by trial and error. Any pointers to a good explanation?
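
A small sketch of the difference, using a made-up data frame df rather than the data above: single brackets keep the container, double brackets pull out the element. On a data frame, df["b"] is still a data frame with one column, while df[["b"]] (like df$b) is the column vector itself. Functions such as substr and nchar expect a character vector, which is presumably why the first attempt with dats[[i]]["b"] produced the odd "c(1, 1, 1" strings.

```r
# Made-up example data; stringsAsFactors = FALSE keeps b as character
df <- data.frame(a = 1:3, b = c("x1M", "y2M", "z3M"),
                 stringsAsFactors = FALSE)

one_col_df <- df["b"]    # single bracket: a data frame with one column
col_vec    <- df[["b"]]  # double bracket: the character vector itself

class(one_col_df)        # data frame
class(col_vec)           # character vector
identical(col_vec, df$b) # [[ "b" ]] and $b extract the same thing

# String functions behave as expected on the vector
trimmed <- substr(col_vec, 1, nchar(col_vec) - 1)
trimmed
```

The same rule applies to the outer list: dats["df1"] would be a list of length one, while dats[["df1"]] is the data frame stored in it.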