I have the following data frame
head(stockdatareturnpercent)
SPY DIA IWM SMH OIH
2001-04-02 8.1985485 7.8349806 7.935566 21.223832 13.975655
2001-05-01 -0.5621328 1.7198760 2.141846 -10.904936 -4.565291
2001-06-01 -2.6957979 -3.5838102 2.786250 4.671762 -23.241009
2001-07-02 -1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
2001-08-01 -6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913
Actually there are more stocks but for purposes of illustration I had to cut it down. In each month I want to know the best to worst (or worst to best) performers. I played around with the sort() function and this is what I came up with.
N <- dim(stockdatareturnpercent)[1]
for (i in 1:N) {
s <- sort(stockdatareturnpercent[i,])
print(s)
}
UPS FDX XLP XLU XLV DIA IWM SPY XLE XLB XLI OIH XLK SMH MSFT
2001-04-02 0.6481585 0.93135 1.923136 4.712996 7.122751 7.83498 7.935566 8.198549 9.826701 10.13465 10.82522 13.97566 14.98789 21.22383 21.41436
SMH FDX OIH XLK XLE SPY XLU XLP DIA MSFT IWM UPS XLV XLB XLI
2001-05-01 -10.90494 -5.045544 -4.565291 -4.182041 -0.9492803 -0.5621328 0.6987724 1.457579 1.719876 2.088734 2.141846 3.73587 3.748309 3.774033 4.099748
OIH XLE XLI XLU XLP XLB DIA UPS SPY XLV FDX XLK IWM SMH MSFT
2001-06-01 -23.24101 -10.02403 -6.594324 -5.8602 -5.0532 -3.955192 -3.58381 -2.814685 -2.695798 -1.177474 0.4987542 1.935544 2.78625 4.671762 5.374764
MSFT OIH XLK IWM SMH XLV UPS XLE SPY XLU XLB XLI DIA FDX
2001-07-02 -9.793005 -9.161594 -7.17351 -5.725078 -3.354391 -2.016818 -1.692442 -1.159914 -1.024809 -0.9029407 -0.2723560 -0.2078283 -0.1997433 2.868898
XLP
2001-07-02 2.998604
This is a very inefficient and cheap way to see the results. It would be nice to create an object that stores this data. However if I type 's' in the R prompt I only get the value of the last row as each subsequent iteration of the for loop replaces the previous data.
I would greatly appreciate some guidance. Thank you kindly.
Use order() for this, as sort() drops the names when using *apply:
id <- t(apply(Data,1,order))
lapply(1:nrow(id),function(i)Data[i,id[i,]])
Using the results of order in an id matrix also allows you to do, e.g.:
matrix(names(Data)[id],ncol=ncol(Data))
[,1] [,2] [,3] [,4] [,5]
[1,] "DIA" "IWM" "SPY" "OIH" "SMH"
[2,] "SMH" "OIH" "SPY" "DIA" "IWM"
[3,] "OIH" "DIA" "SPY" "IWM" "SMH"
[4,] "OIH" "IWM" "SMH" "SPY" "DIA"
[5,] "OIH" "SMH" "SPY" "DIA" "IWM"
[6,] "SMH" "OIH" "IWM" "DIA" "SPY"
This shows which ones were the best at a given moment (each row runs from worst to best).
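As a minimal sketch of how the id matrix could be used to pull out just the best and worst performer per row: the sample values are the ones from this answer, rebuilt inline so the block runs on its own, and the names ranked, worst and best are only illustrative.

```r
# Same sample values as in this answer, built inline so the block is self-contained
Data <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979, -1.0248091, -6.1165559, -8.8900629),
  DIA = c(7.8349806, 1.7198760, -3.5838102, -0.1997433, -5.0276558, -12.2663267),
  IWM = c(7.935566, 2.141846, 2.786250, -5.725078, -2.461728, -15.760037),
  SMH = c(21.223832, -10.904936, 4.671762, -3.354391, -6.218129, -39.321172),
  OIH = c(13.975655, -4.565291, -23.241009, -9.161594, -13.956695, -16.902913)
)

# order() applied to every row; t() restores one row per month
id <- t(apply(Data, 1, order))

# Column names rearranged from worst to best within each row
ranked <- matrix(names(Data)[id], ncol = ncol(Data))

# First column holds the worst performer, last column the best (illustrative names)
worst <- ranked[, 1]
best  <- ranked[, ncol(ranked)]
best
```

Because order() sorts ascending by default, the best performer always ends up in the last column; passing decreasing = TRUE to order() would flip that.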
If you want to keep your loop, you could use a list. As Joshua said, you overwrite s on every iteration, so initialize a list to store the results first. This loop gives the same results as the lapply() code above, but without the id matrix. There's no gain in speed, although using apply has other benefits:
N <- nrow(Data)
s <- vector("list",N)
for (i in 1:N) {
s[[i]] <- sort(Data[i,])
}
I tested the code using the following sample data (please provide your own in the future, using either this example or, e.g., dput()):
zz <- textConnection(" SPY DIA IWM SMH OIH
8.1985485 7.8349806 7.935566 21.223832 13.975655
-0.5621328 1.7198760 2.141846 -10.904936 -4.565291
-2.6957979 -3.5838102 2.786250 4.671762 -23.241009
-1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
-6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
-8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 ")
Data <- read.table(zz, header = TRUE)
close(zz)
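For the dput() route mentioned above, a rough sketch of the round trip could look like this. The two-row data frame is a cut-down version of the sample, and txt and Data2 are illustrative names only.

```r
# A small data frame standing in for the real one
Data <- data.frame(
  SPY = c(8.1985485, -0.5621328),
  DIA = c(7.8349806, 1.7198760)
)

# dput() prints an R expression that rebuilds the object;
# capture.output() grabs that text instead of printing it
txt <- capture.output(dput(Data))

# Anyone pasting that text into their session gets the same object back
Data2 <- eval(parse(text = txt))
all.equal(Data, Data2)
```

In a question you would simply paste the output of dput(Data) and let readers assign it to a name themselves.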
Using your original code to save each sorted row in a list:
stockdatareturnpercent <- read.table(textConnection(" SPY DIA IWM SMH OIH
2001-04-02 8.1985485 7.8349806 7.935566 21.223832 13.975655
2001-05-01 -0.5621328 1.7198760 2.141846 -10.904936 -4.565291
2001-06-01 -2.6957979 -3.5838102 2.786250 4.671762 -23.241009
2001-07-02 -1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
2001-08-01 -6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913"))
x <- vector("list", nrow(stockdatareturnpercent))
## use unlist to drop the data.frame structure
for (i in 1:nrow(stockdatareturnpercent)) {
x[[i]] <- sort(unlist(stockdatareturnpercent[i,]) )
}
## use the row names to name each list element
names(x) <- rownames(stockdatareturnpercent)
x
$`2001-04-02`
DIA IWM SPY OIH SMH
7.834981 7.935566 8.198548 13.975655 21.223832
$`2001-05-01`
SMH OIH SPY DIA IWM
-10.9049360 -4.5652910 -0.5621328 1.7198760 2.1418460
$`2001-06-01`
OIH DIA SPY IWM SMH
-23.241009 -3.583810 -2.695798 2.786250 4.671762
$`2001-07-02`
OIH IWM SMH SPY DIA
-9.1615940 -5.7250780 -3.3543910 -1.0248091 -0.1997433
$`2001-08-01`
OIH SMH SPY DIA IWM
-13.956695 -6.218129 -6.116556 -5.027656 -2.461728
$`2001-09-04`
SMH OIH IWM DIA SPY
-39.321172 -16.902913 -15.760037 -12.266327 -8.890063
You can also use apply directly to sort each row, but this does not preserve the element names:
apply(stockdatareturnpercent, 1, sort)
That returns a matrix where each column is the sorted row. Then transpose:
sortmat <- t(apply(stockdatareturnpercent, 1, sort))
If you need the result as a data.frame, as.data.frame it:
sortdf <- as.data.frame(sortmat)
Finally, all of that in one line:
sortdf <- as.data.frame(t(apply(stockdatareturnpercent, 1, sort)))
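Since the names are lost this way, one possible workaround is to build a second matrix of tickers alongside the sorted values. The sketch below assumes the first three rows of the sample data and sorts best to worst; namemat is a name introduced here purely for illustration.

```r
# First three months of the sample data, with the dates as row names
stockdatareturnpercent <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979),
  DIA = c(7.8349806, 1.7198760, -3.5838102),
  IWM = c(7.935566, 2.141846, 2.786250),
  SMH = c(21.223832, -10.904936, 4.671762),
  OIH = c(13.975655, -4.565291, -23.241009),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# Sorted values, best first; columns are rank positions, not tickers
sortmat <- t(apply(stockdatareturnpercent, 1, sort, decreasing = TRUE))

# Matching tickers in the same rank positions (illustrative helper matrix)
namemat <- t(apply(stockdatareturnpercent, 1,
                   function(r) names(r)[order(r, decreasing = TRUE)]))

sortmat
namemat
```

Reading the two matrices side by side, namemat[i, j] is the ticker whose return is sortmat[i, j].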