I have a table with the following rows:
df <- data.frame(Time=c(1,3),date=c(23,12),
people=c("Apple&June&Peter","Apple&May&Mary"),stringsAsFactors=FALSE)
Time date people
1 23 Apple&June&Peter
3 12 Apple&May&Mary
I need to separate them into different rows:
Time date people
1 23 Apple
1 23 June
1 23 Peter
3 12 Apple
3 12 May
3 12 Mary
I know reshape + colsplit can be used to split the people column into different columns on the same row.
What about rows? How can I split the values into different rows within the same column?
A base R way of doing this, using strsplit:
as.data.frame(
t(
do.call(cbind,
lapply(1:nrow(df),function(x){
sapply(unlist(strsplit(df[x,3],"&")),c,df[x,1:2],USE.NAMES=FALSE)
})
)
)
)
V1 Time date
1 Apple 1 23
2 June 1 23
3 Peter 1 23
4 Apple 3 12
5 May 3 12
6 Mary 3 12
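Another base R option: split the strings with strsplit, then repeat each row's Time and date once per name with rep.
df <- data.frame(Time=c(1,3),date=c(23,12),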
people=c("Apple&June&Peter","Apple&May&Mary"),stringsAsFactors=FALSE)
long.people=strsplit(df$people,"&")
el.len=sapply(long.people,length)
new.df=data.frame(Time=rep(df$Time,el.len),date=rep(df$date,el.len),people=unlist(long.people))
new.df
Time date people
1 1 23 Apple
2 1 23 June
3 1 23 Peter
4 3 12 Apple
5 3 12 May
6 3 12 Mary
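The same idea can be wrapped in a small reusable function. This is only a sketch in base R: the name `split_rows` and its arguments are hypothetical (they do not come from the answers above), and it assumes the separator is a literal string rather than a regular expression.

```r
# Hypothetical helper, not part of any package: expand one delimited
# column into one row per piece, repeating all other columns.
split_rows <- function(data, col, sep = "&") {
  # Split each string on the literal separator
  pieces <- strsplit(as.character(data[[col]]), sep, fixed = TRUE)
  # Number of pieces per original row
  n <- lengths(pieces)
  # Repeat each original row once per piece
  out <- data[rep(seq_len(nrow(data)), n), , drop = FALSE]
  out[[col]] <- unlist(pieces)
  rownames(out) <- NULL
  out
}

df <- data.frame(Time = c(1, 3), date = c(23, 12),
                 people = c("Apple&June&Peter", "Apple&May&Mary"),
                 stringsAsFactors = FALSE)
result <- split_rows(df, "people")
```

Unlike the version above, this keeps every other column without naming each one, so it should carry over to tables with more columns than Time and date.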
A variation on the reshape solution, using stringr for more convenient splitting of the name strings.
library(reshape)
library(stringr)
wide_df <- cbind(df[, 1:2], str_split_fixed(df[, 3], "&", 3))
long_df <- melt(wide_df, id.vars = c("Time", "date"))
long_df$variable <- NULL
names(long_df)[3] <- "people"
long_df
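The literal 3 passed to str_split_fixed hard-codes the number of names per row. If that number is not known in advance, it could be computed from the data first. A minimal base R sketch, where `n_max` is a name introduced here for illustration:

```r
df <- data.frame(Time = c(1, 3), date = c(23, 12),
                 people = c("Apple&June&Peter", "Apple&May&Mary"),
                 stringsAsFactors = FALSE)

# Largest number of "&"-separated names found in any row
n_max <- max(lengths(strsplit(df$people, "&", fixed = TRUE)))
```

`n_max` could then replace the literal 3. Note that str_split_fixed pads rows that have fewer pieces with empty strings, so with uneven rows the melted result would likely need those empty people values filtered out.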
You can use colsplit and then reshape the resulting data.frame back to long form. Because df already has a column named time, give reshape a different timevar name (part here) so that it does not overwrite that column, then drop the id and part columns that reshape creates:
library(reshape)
df <- data.frame(time=c(1,3),date=c(23,12),people=c("Apple&June&Peter","Apple&May&Mary"))
pnames <- paste("people",seq(3),sep=".")
df.new <- cbind(df[,seq(2)],colsplit(df$people,"&",pnames))
df.new <- reshape(df.new,varying=pnames,direction="long",timevar="part")
df.new <- subset(df.new,select=-c(id,part))
df.new
time date people
1.1 1 23 Apple
2.1 3 12 Apple
1.2 1 23 June
2.2 3 12 May
1.3 1 23 Peter
2.3 3 12 Mary