I read the data the following way:
data<-read.csv("userStats.csv", sep=",", header=F)
I tried to select an element at a specific position.
A sample of the data (first five rows) follows (V2 is the date and V3 is the day of the week):
V1 V2
1 00002781A2ADA816CDB0D138146BD63323CCDAB2 2010-09-04
2 00002D2354C7080C0868CB0E18C46157CA9F0FD4 2010-09-04
3 00002D2354C7080C0868CB0E18C46157CA9F0FD4 2010-09-07
4 00002D2354C7080C0868CB0E18C46157CA9F0FD4 2010-09-08
5 00002D2354C7080C0868CB0E18C46157CA9F0FD4 2010-09-17
V3 V4 V5 V6 V7 V8 V9
1 Saturday 2 2 615 1 1 47
2 Saturday 2 2 77 1 1 43
3 Tuesday 1 3 201 1 1 117
4 Wednesday 1 1 44 1 1 74
5 Friday 1 1 3 1 1 18
I tried to divide the 6th column by the 9th column in the first row the following way:
data[1,6]/data[1,9]
but it returned NA with a warning:
[1] NA
Warning message:
In Ops.factor(data[1, 6], data[1, 9]) : / not meaningful for factors
Then I tried to select just one element
> data[2,9]
[1] 43
11685 Levels: 0 1 2 3 ... 55311
but I don't know what these Levels are or what causes the warning. Does anyone know how to select an element at a specific position, data[row, column]?
Thank you!
My favorite tool to check a variable's class is str().
What you have there is a data frame and at least one of the columns you're trying to work with is a factor. See Dirk's answer on how to change classes of a column.
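As a rough illustration of the point above, here is a minimal sketch that reproduces the factor problem and one common way around it. The three-row CSV text and the names csv_text, df and v2_num are made up for the example; they stand in for the real userStats.csv, and stringsAsFactors = TRUE is set explicitly to mimic the behaviour seen in the question.

```r
# Hypothetical stand-in for the real file: the second column contains one
# text value ("n/a"), so it cannot be read as a number.
csv_text <- "615,47\n77,43\n201,n/a"
df <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = TRUE)

str(df)  # shows V1 as int and V2 as a Factor

# as.numeric() applied directly to a factor returns the internal level codes,
# not the printed values, so go through character first.
# "n/a" cannot be parsed and becomes NA (the coercion warning is suppressed here).
v2_num <- suppressWarnings(as.numeric(as.character(df$V2)))

# Dividing now works, because both operands are numeric.
ratio <- df[1, 1] / v2_num[1]
ratio
```

The Levels line printed in the question is the set of distinct values the factor column can take, which is why arithmetic on it is refused.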
Command
data[1,6]/data[1,9]
selects the value in the first row of the sixth column and divides it by the value in the first row of the ninth column. Is this what you want? If you want to use values from the entire column (and not just the first row), you would write
data[6] / data[9]
or
data[, 6] / data[, 9]
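Both expressions compute the same values for data.frames, although data[6] / data[9] returns a one-column data.frame while data[, 6] / data[, 9] returns a plain vector.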
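To make the difference between the indexing forms concrete, here is a small sketch on an already-numeric data frame. The data frame df and its values are assumptions for the example (the first three rows of V6 and V9 from the question, typed in by hand).

```r
# Hypothetical numeric stand-in for columns V6 and V9 of the question's data.
df <- data.frame(V6 = c(615, 77, 201), V9 = c(47, 43, 117))

single   <- df[1, 1] / df[1, 2]  # one cell divided by another: a single number
as_frame <- df[1] / df[2]        # single-bracket column selection keeps a data.frame
as_vec   <- df[, 1] / df[, 2]    # row/column indexing drops to a plain vector

class(as_frame)  # "data.frame"
class(as_vec)    # "numeric"
```

The numbers are the same either way; the vector form is usually more convenient when the result is assigned to a new column.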
The standard modeling data structure in R is a data.frame. A data.frame object can hold columns of various types: numeric, character, factor, ...
Now, when reading data via read.csv() et al., you can get bitten by the default value of the stringsAsFactors option. I presume that at least one row in your data had text in that column, so R decided to encode it as a factor, and presto! You can no longer do direct mathematical operations on the column.
In short, run summary(data) and/or a sweep of class() over all the columns. Convert as necessary, or set the stringsAsFactors option to a different value, or both.
Once your data is numeric, you can divide, slice, dice, ... as you please.
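The workflow described above can be sketched roughly as follows. The inline CSV text, the stray value "oops" and the names csv_text, df and classes are invented for the example. Note that from R 4.0.0 onwards stringsAsFactors already defaults to FALSE in read.csv(), so passing it explicitly mainly matters on older versions.

```r
# Hypothetical stand-in for userStats.csv; the last column has one bad entry.
csv_text <- "a,615,47\nb,77,43\nc,201,oops"
df <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# Sweep class() over all the columns to see what was actually read.
classes <- sapply(df, class)
classes

# Convert the column that should be numeric; entries that cannot be parsed
# become NA (the coercion warning is suppressed here).
df$V3 <- suppressWarnings(as.numeric(df$V3))

# Column-wise division now works; the row with the bad entry yields NA.
df$V2 / df$V3
```

Looking at which rows end up as NA after the conversion is a quick way to find the offending text values in the original file.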