
Inputting one column of info into an R data frame

Developer (开发者) https://www.devze.com 2023-01-17 14:02 Source: Internet

I am currently using this code to input data from numerous files into R:

library(foreign)  # not strictly needed here: read.table() lives in utils, which R loads by default

setwd("/Users/ericbrotto/Desktop/A_Intel/")

filelist <- list.files()

# assuming semicolon-separated values with a header
datalist <- lapply(filelist, function(x) read.table(x, header=TRUE, sep=";", comment.char=""))

# assuming the same header/columns for all files
datafr <- do.call("rbind", datalist)

The headers look like this:

TIME ;POWER SOURCE ;qty MONITORS ;NUM PROCESSORS ;freq of CPU Mhz ;SCREEN SIZE ;CPU LOAD ;BATTERY LEVEL ; KEYBOARD MVT ; MOUSE MVT ;BATTERY MWH ;HARD DISK SPACE ;NUMBER PROCESSES ;RAM ;RUNNING APPS  ;FOCUS APP ;BYTES IN ;BYTES OUT ;ACTIVE NETWORKS ; IP ADDRESS ; NAMES OF FILES ; 

and an example of the data looks like this:

 2010-09-11-19:28:34.680 ; BA ; 1 ; 2 ; 2000 ; 1440 : 900  ; 0.224121 ; 92 ; NO ; NO ; NULL ; 92.581558  ;  57    ; 196.1484375   ; +NULL  ; loginwindow-#35  ;  5259  ;  4506  ; en1 :   ;  192.168.1.3  ;  NULL  ;    

Rather than input all of the columns into a data frame, I would like to grab just one, say FOCUS APP.


If you just want to read in a particular column from your files, then colClasses is the way to go. For example, suppose your data looked like this:

a,b
1,2
3,4

Then

## Use colClasses to select columns
## "NULL" means skip the column
## "numeric" means that the column is numeric
## Other options are Date, factor - see ?read.table for more
## Use NA to let R decide
data = read.table("/tmp/tmp.csv", sep=",", 
                  colClasses=c("NULL", "numeric"), 
                  header=TRUE)

gives just the second column.

> data
  b
1 2
2 4
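
Applied to the files in the question, a rough sketch might look like the following. Note that read_one_column is a hypothetical helper, not something from a package. It assumes each file is semicolon-separated and starts with the header line shown in the question. It reads that header to find the position of the wanted column, then builds a colClasses vector that skips everything else.

```r
# Hypothetical helper (not from any package): read a single column
# from a semicolon-separated file with a header line.
read_one_column <- function(path, column, sep = ";") {
  # Read only the header line. strip.white = TRUE trims the padding
  # around each name, so "FOCUS APP " is seen as "FOCUS APP".
  header <- scan(path, what = "", sep = sep, nlines = 1, quote = "",
                 strip.white = TRUE, comment.char = "", quiet = TRUE)

  idx <- match(column, header)
  if (is.na(idx)) stop("column not found: ", column)

  # "NULL" skips a column; only the requested one is kept, as text.
  classes <- rep("NULL", length(header))
  classes[idx] <- "character"

  # check.names = FALSE keeps the header spelling (spaces included);
  # comment.char = "" stops values such as "loginwindow-#35" being cut at "#".
  read.table(path, header = TRUE, sep = sep, quote = "",
             comment.char = "", colClasses = classes,
             strip.white = TRUE, check.names = FALSE)
}
```

With the question's loop this would be used as lapply(filelist, read_one_column, column = "FOCUS APP"), followed by the same do.call("rbind", ...) step. Passing quote = "" is an assumption that the files contain no quoted fields.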


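Maybe just adding the column name to your read.table line is enough. Note that with the default check.names=TRUE, read.table turns the header FOCUS APP into the syntactic name FOCUS.APP, so that is the name to index by:

datalist = lapply(filelist, function(x) read.table(x, header=TRUE, sep=";", comment.char="")["FOCUS.APP"])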
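
For completeness, here is a sketch of the same idea as a whole pipeline, from listing the files to one combined data frame. The function name read_column_from_dir is made up for illustration, and it assumes every file in the directory shares the same semicolon-separated layout. It passes check.names = FALSE so the column can be indexed by the header exactly as written in the file.

```r
# Hypothetical helper: read every file in `dir`, keep one column from
# each, and stack the results into a single one-column data frame.
read_column_from_dir <- function(dir, column, sep = ";") {
  files <- list.files(dir, full.names = TRUE)

  pieces <- lapply(files, function(f) {
    d <- read.table(f, header = TRUE, sep = sep, quote = "",
                    comment.char = "", strip.white = TRUE,
                    check.names = FALSE,       # keep "FOCUS APP" as written
                    stringsAsFactors = FALSE)  # text stays character
    d[column]  # single-bracket indexing returns a one-column data frame
  })

  do.call(rbind, pieces)
}
```

Unlike the colClasses approach, this parses every column of every file before discarding the rest, so it is simpler but slower on large inputs.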


If you are only doing this once, then the colClasses answer is probably the best (although it still reads through all the data and merely skips processing the other columns). If you do this kind of thing often, you may want to use a database instead. Look at the RSQLite, sqldf, and SQLiteDF packages, as well as RODBC, for some possibilities.
