I know that to get a row from a data frame in R, we can do this:
data[row,]
where row is an integer. But that spits out an ugly-looking data structure in which every value is labeled with its column name. How can I just get a row as a list of values?
Data frames created by importing data from an external source had their character data converted to factors by default before R 4.0.0 (since then the default is stringsAsFactors=FALSE). If you do not want this conversion, set stringsAsFactors=FALSE.
In this case, to extract a row or a column as a vector, you need to do something like this:
as.numeric(as.vector(DF[1,]))
or like this
as.character(as.vector(DF[1,]))
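If the row contains factors, one way to be sure you get the labels rather than the internal integer codes is to convert each column on its own. This is only a sketch: the data frame DF and its column names below are made up for illustration, and vapply() is a swapped-in base R technique, not part of the answer above.

```r
# Hypothetical data frame with a factor column (names are illustrative only)
DF <- data.frame(id = 1:3,
                 grp = factor(c("low", "high", "low")))

# Apply as.character() to each column of the one-row data frame separately,
# so the factor is converted to its label instead of its integer code
row_chr <- vapply(DF[1, ], as.character, character(1))
row_chr
```

The result is a named character vector; wrap it in unname() if the names are not wanted.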
You can't necessarily get it as a vector because each column might have a different mode. You might have numerics in one column and characters in the next.
If you know the mode of the whole row, or can convert to the same type, you can use the mode's conversion function (for example, as.numeric()) to convert to a vector. For example:
> state.x77[1,]
Population Income Illiteracy Life Exp Murder HS Grad Frost
3615.00 3624.00 2.10 69.05 15.10 41.30 20.00
Area
50708.00
> as.numeric(state.x77[1,])
[1] 3615.00 3624.00 2.10 69.05 15.10 41.30 20.00 50708.00
This would work even if some of the columns were integers, although they would be converted to numeric floating-point numbers.
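Note that state.x77 is a matrix rather than a data frame, so here is a rough sketch of the same idea on an actual data frame. The data frame df_num and its columns are hypothetical, and unlist() is used here as one possible way to flatten the row.

```r
# Hypothetical all-numeric data frame (column names are illustrative only)
df_num <- data.frame(count = c(3L, 5L), weight = c(1.5, 2.25))

# unlist() flattens the one-row data frame into a named atomic vector;
# the integer value is promoted to double to match the other column
row_vec <- unlist(df_num[1, ])
typeof(row_vec)

# Drop the names if only the bare values are wanted
unname(row_vec)
```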
There is a problem with what you propose; namely that the components of data frames (what you call columns) can be of different data types. If you want a single row as a vector, that must contain only a single data type - they are atomic vectors!
Here is an example:
> set.seed(2)
> dat <- data.frame(A = 1:10, B = sample(LETTERS[1:4], 10, replace = TRUE))
> dat
A B
1 1 A
2 2 C
3 3 C
4 4 A
5 5 D
6 6 D
7 7 A
8 8 D
9 9 B
10 10 C
> dat[1, ]
A B
1 1 A
If we force it to drop the empty (column) dimension with drop = TRUE, the only recourse for R is to convert the row to a list to maintain the disparate data types.
> dat[1, , drop = TRUE]
$A
[1] 1
$B
[1] A
Levels: A B C D
The only logical solution to this is to get the data frame into a common type by coercing it to a matrix. This is done via data.matrix(), for example:
> mat <- data.matrix(dat)
> mat[1,]
A B
1 1
data.matrix() converts factors to their internal numeric codes. The above allows the first row to be extracted as a vector.
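Because the factor ends up as integer codes, you may want to map a code back to its label afterwards. A minimal sketch, assuming a small data frame built with an explicit factor (the three-row dat below is a cut-down stand-in for the example above, not the same object):

```r
# Cut-down stand-in for the example data, with B created explicitly as a factor
dat <- data.frame(A = 1:3,
                  B = factor(c("A", "C", "C"), levels = c("A", "B", "C", "D")))

mat <- data.matrix(dat)   # factor B becomes its integer codes
row2 <- mat[2, ]          # named vector holding A and the code for B

# Look the code up in the factor's levels to recover the original label
label_b <- levels(dat$B)[row2["B"]]
label_b
```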
However, if you have character data in the data frame, the only recourse will be to create a character matrix, which may or may not be useful. data.matrix() can no longer be used, so we need as.matrix() instead:
> dat$String <- LETTERS[1:10]
> str(dat)
'data.frame': 10 obs. of 3 variables:
$ A : int 1 2 3 4 5 6 7 8 9 10
$ B : Factor w/ 4 levels "A","B","C","D": 1 3 3 1 4 4 1 4 2 3
$ String: chr "A" "B" "C" "D" ...
> mat <- data.matrix(dat)
Warning message:
NAs introduced by coercion
> mat
A B String
[1,] 1 1 NA
[2,] 2 3 NA
[3,] 3 3 NA
[4,] 4 1 NA
[5,] 5 4 NA
[6,] 6 4 NA
[7,] 7 1 NA
[8,] 8 4 NA
[9,] 9 2 NA
[10,] 10 3 NA
> mat <- as.matrix(dat)
> mat
A B String
[1,] " 1" "A" "A"
[2,] " 2" "C" "B"
[3,] " 3" "C" "C"
[4,] " 4" "A" "D"
[5,] " 5" "D" "E"
[6,] " 6" "D" "F"
[7,] " 7" "A" "G"
[8,] " 8" "D" "H"
[9,] " 9" "B" "I"
[10,] "10" "C" "J"
> mat[1, ]
A B String
" 1" "A" "A"
> class(mat[1, ])
[1] "character"
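As the output shows, as.matrix() formats the numbers to a common width, so " 1" carries a leading space. If that padding is a problem, trimws() can strip it. This is just a sketch on a simplified two-column version of dat; the trimming step is my addition, not part of the answer above.

```r
# Simplified version of the example data: one integer and one character column
dat <- data.frame(A = 1:10, String = LETTERS[1:10], stringsAsFactors = FALSE)

mat <- as.matrix(dat)      # character matrix; numbers are formatted as text
row1 <- trimws(mat[1, ])   # remove any padding added by the formatting

# Individual values can be converted back to numbers where that makes sense
a_value <- as.numeric(row1["A"])
```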
How about this?
library(tidyverse)
dat <- as_tibble(iris)
pulled_row <- dat %>% slice(3) %>% flatten_chr()
If you know all the values are the same type, then use the matching flatten_xxx function. Otherwise, I think flatten_chr() is safer.
As user "Reinstate Monica" notes, this problem has two parts:
- A data frame will often have different data types in each column that need to be coerced to character strings.
- Even after coercing the columns to character format, the data.frame "shell" needs to be stripped off to create a vector via a command like unlist.
With a combination of dplyr and base R this can be done in two lines. First, mutate_all converts all columns to character format. Second, the unlist command extracts the vector out of the data.frame structure.
My particular issue was that the second line of a csv included the actual column names. So, I wanted to extract the second row to a vector and use that to assign column names. The following worked to extract the row as a character vector:
library(dplyr)
data_col_names <- data[2, ] %>%
  mutate_all(as.character) %>%
  unlist(., use.names=FALSE)
# example of using extracted row to rename cols
names(data) <- data_col_names
# only for this example, you'd want to remove row 2
# data <- data[-2, ]
(Note: Using as.character() in place of unlist will work too, but it's less intuitive to apply as.character twice.)
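If you would rather not load dplyr, the same two steps can probably be done in base R alone. A sketch under stated assumptions: the data frame below is invented (junk names V1 and V2, real names in row 2), and vapply() stands in for mutate_all() plus unlist().

```r
# Hypothetical import where row 2 holds the real column names
data <- data.frame(V1 = c("x", "id", "1"),
                   V2 = c("y", "score", "9.5"),
                   stringsAsFactors = TRUE)

# Convert each column of row 2 to character and drop the old names
data_col_names <- unname(vapply(data[2, ], as.character, character(1)))

names(data) <- data_col_names
data <- data[-2, ]   # remove the row that held the names
```

Converting column by column means factor columns give their labels, not their integer codes.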
The shortest variant I can see is
c(t(data[row,]))
However, if at least one column in data is a column of strings, it will return a character vector.
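To make the difference concrete, here is a small sketch with two made-up data frames, one all numeric and one mixed. The object and column names are hypothetical.

```r
# Hypothetical data frames: one all numeric, one with a string column
num_df   <- data.frame(a = c(1.5, 2), b = c(10, 20))
mixed_df <- data.frame(a = c(1.5, 2), s = c("x", "y"), stringsAsFactors = FALSE)

# t() turns the one-row data frame into a matrix; c() drops the dimensions
num_row   <- c(t(num_df[1, ]))     # stays numeric
mixed_row <- c(t(mixed_df[1, ]))   # everything is coerced to character

is.numeric(num_row)
is.character(mixed_row)
```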