2011-10-11

I've run into an application where I need to sort a data.frame by column number, and none of the usual solutions seem to allow that.

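The context is writing an as.data.frame.by method. In the result for a by object, the last column is the value column and the first ncol-1 columns are index columns. melt returns the rows sorted backwards: by index 3, then index 2, then index 1. For compatibility with latex.table.by, I want to re-sort the result, but I'm having trouble doing this in a general way. The commented-out line in the function below is my best attempt so far.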

as.data.frame.by <- function(x, colnames=paste("IDX",seq(length(dim(x))),sep=""), ...) { 
    num.by.vars <- length(dim(x)) 
    res <- melt(unclass(x)) 
    res <- na.omit(res) 
    colnames(res)[seq(num.by.vars)] <- colnames 
    #res <- res[ order(res[ , seq(num.by.vars)]) , ] # Sort the results by the by vars in the hierarchy given
    res 
} 

dat <- transform(ChickWeight, Time=cut(Time,3), Chick=cut(as.numeric(Chick),3)) 
my.by <- by(dat, with(dat,list(Time,Chick,Diet)), function(x) sum(x$weight)) 
> as.data.frame(my.by) 
            IDX1         IDX2 IDX3 value
1  (-0.021,6.99] (0.951,17.3]    1  3475
2      (6.99,14] (0.951,17.3]    1  5969
3        (14,21] (0.951,17.3]    1  8002
4  (-0.021,6.99]  (17.3,33.7]    1   640
5      (6.99,14]  (17.3,33.7]    1  1596
6        (14,21]  (17.3,33.7]    1  2900
13 (-0.021,6.99]  (17.3,33.7]    2  2253
14     (6.99,14]  (17.3,33.7]    2  4734
15       (14,21]  (17.3,33.7]    2  7727
22 (-0.021,6.99]  (17.3,33.7]    3   666
23     (6.99,14]  (17.3,33.7]    3  1391
24       (14,21]  (17.3,33.7]    3  2109
25 (-0.021,6.99]    (33.7,50]    3  1647
26     (6.99,14]    (33.7,50]    3  3853
27       (14,21]    (33.7,50]    3  7488
34 (-0.021,6.99]    (33.7,50]    4  2412
35     (6.99,14]    (33.7,50]    4  5448
36       (14,21]    (33.7,50]    4  8101

With that line uncommented, the function returns garbage: order just treats the whole data.frame as a single vector, with disastrous results.

I even tried something clever like res <- res[ order(...=list(res[,1],res[,2])) , ] but to no avail.

I suspect there is a simple way to do this, but I'm not seeing it.

Edit to clarify: I don't want to specify the column names. Instead, I want to be able to sort by a numeric vector (e.g. sort by columns 1:4).

Answer

mydf <- as.data.frame(my.by) 
mydf[order(mydf$IDX3, mydf$IDX2, mydf$IDX1) , ] 
            IDX1         IDX2 IDX3 value
1  (-0.021,6.99] (0.951,17.3]    1  3475
3        (14,21] (0.951,17.3]    1  8002
2      (6.99,14] (0.951,17.3]    1  5969
4  (-0.021,6.99]  (17.3,33.7]    1   640
6        (14,21]  (17.3,33.7]    1  2900
5      (6.99,14]  (17.3,33.7]    1  1596
13 (-0.021,6.99]  (17.3,33.7]    2  2253
15       (14,21]  (17.3,33.7]    2  7727
14     (6.99,14]  (17.3,33.7]    2  4734
22 (-0.021,6.99]  (17.3,33.7]    3   666
24       (14,21]  (17.3,33.7]    3  2109
23     (6.99,14]  (17.3,33.7]    3  1391
25 (-0.021,6.99]    (33.7,50]    3  1647
27       (14,21]    (33.7,50]    3  7488
26     (6.99,14]    (33.7,50]    3  3853
34 (-0.021,6.99]    (33.7,50]    4  2412
36       (14,21]    (33.7,50]    4  8101
35     (6.99,14]    (33.7,50]    4  5448

Or, reverse the order of the grouping variables in the by call:

my.by <- by(dat, with(dat,list(Diet,Chick, Time)), function(x) sum(x$weight)) 
mydf <- as.data.frame(my.by) 
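
Reordering the grouping variables helps because, as the output in the question shows, the long form varies the first index fastest. Here is a minimal sketch of that behaviour on a hypothetical toy array. It uses base R's as.data.frame(as.table(...)) as a stand-in for melt, which is an assumption made to keep the example free of package dependencies, not the code from the question.

```r
# Hypothetical 2 x 3 array standing in for unclass(my.by)
arr <- array(1:6, dim = c(2, 3),
             dimnames = list(IDX1 = c("a", "b"), IDX2 = c("x", "y", "z")))

# Base-R stand-in for melt(): one row per cell, first index varying fastest
long <- as.data.frame(as.table(arr), responseName = "value")

# Swapping the dimensions (the analogue of reversing the list passed to by())
# makes IDX2 vary fastest, so rows come out grouped by IDX1
long_swapped <- as.data.frame(as.table(aperm(arr)), responseName = "value")

print(long)
print(long_swapped)
```

The column order in long_swapped is reversed as well (IDX2 first), which may or may not matter for downstream code.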

Edit (sorting by column position rather than by name):

mydf <- as.data.frame(my.by) 
mydf[ do.call(order, mydf[, 3:1]) , ] 
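
The do.call line above can be wrapped in a small helper that sorts by any vector of column positions. This is only a sketch on hypothetical toy data; sort_by_cols is an illustrative name, not something from the original answer. The unname step is a precaution I am adding: do.call passes list names as argument names, so a column that happened to be called decreasing or na.last would otherwise be matched to order's own arguments.

```r
# Hypothetical toy data standing in for the melted `by` result
df <- data.frame(
  IDX1  = c("b", "a", "b", "a"),
  IDX2  = c("y", "y", "x", "x"),
  value = c(4, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort rows by the columns at the given positions;
# earlier positions in `cols` take precedence
sort_by_cols <- function(d, cols) {
  # Each selected column becomes one unnamed sort key for order()
  keys <- unname(as.list(d[, cols, drop = FALSE]))
  d[do.call(order, keys), , drop = FALSE]
}

# Sort by column 2 first, then by column 1 to break ties
sorted <- sort_by_cols(df, 2:1)
print(sorted)
```

Inside as.data.frame.by, the equivalent call would be along the lines of sort_by_cols(res, rev(seq(num.by.vars))) or sort_by_cols(res, seq(num.by.vars)), depending on which index should vary slowest.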
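
Sorry, I should have been clearer: I want to avoid specifying column names and instead use numeric column indices to produce the same output as above. In other words, I want to be able to sort by a numeric vector (e.g. sort by columns 1:4).

See above. The do.call method of passing a data frame to `order` is illustrated on the `help(order)` page.

Nice. Thanks. I need to study `do.call` more carefully, since I suspect it will solve many of my problems :-)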
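
For readers who have not used do.call before, a minimal sketch of what it does here: it calls a function with the elements of a list as separate arguments, so a list of columns becomes a sequence of sort keys. The vectors below are hypothetical.

```r
a <- c(2, 1, 2, 1)
b <- c("q", "q", "p", "p")

# Calling order() directly with two sort keys
direct <- order(a, b)

# The same call built from a list; each element becomes one argument
spliced <- do.call(order, list(a, b))

# Named list elements are passed as named arguments
descending <- do.call(order, list(a, b, decreasing = TRUE))

print(direct)
print(spliced)
print(descending)
```

This is also why order(...=list(res[,1],res[,2])) from the question does not work: it hands order a single list argument rather than two separate keys.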