Speeding up rbindlist inside two nested for loops

I have a dataset that looks like this:

library(data.table)
test <- data.table(Weight = sample(x = 20:100, 500, replace = TRUE), y = rnorm(500), z = rnorm(500))

> head(test)
   Weight          y           z
1:     87 -0.7946846 -0.03136408
2:     97  1.6570765  0.61080309
3:     80  1.1592073 -0.09389739
4:     23 -0.0268602 -1.36896141
5:     32  1.3171078 -2.19978789
6:     78 -0.1961162  0.62026338

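I want to replicate each row as many times as the value in its Weight column. I have already done this with the following code (I included a progress bar):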

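# Setup assumed here; the original post does not show how pb and Testoutcome were created
pb <- txtProgressBar(min = 0, max = nrow(test), style = 3)
Testoutcome <- test[0]  # empty data.table with the same columns as test

system.time(
  for (i in 1:nrow(test)) {
    setTxtProgressBar(pb, i)
    # append row i once per unit of Weight, growing Testoutcome on every pass
    for (j in 1:test[i, ]$Weight) {
      Testoutcome <- rbind(Testoutcome, test[i, ])
    }
  })
   user  system elapsed
  32.91    0.08   33.57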

I found a post here saying that rbindlist is much faster than rbind, so I changed the code like this:

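Testoutcome <- test[0]  # reset before the second run (assumed; not shown in the original post)

system.time(
  for (i in 1:nrow(test)) {
    setTxtProgressBar(pb, i)
    # same loop as before, with rbindlist in place of rbind
    for (j in 1:test[i, ]$Weight) {
      Testoutcome <- rbindlist(list(Testoutcome, test[i, ]))
    }
  })
   user  system elapsed
  27.72    0.05   28.31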

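So it does not seem to be much faster. My actual dataset is about 1,000 times larger, and the run takes forever... Any ideas on how to speed this up? Maybe I should move the binding outside the loop?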


2015-04-03 Tim_Utrecht

This may be related: http://stats.stackexchange.com/questions/25148/how-to-expand-data-frame-in-r – Frank 2015-04-03 14:04:27

Ah, here is an exact match, although it does not ask about speed: http://stackoverflow.com/questions/19518728/ – Frank 2015-04-03 14:09:14

This should be fast, and it is fairly simple:

test[rep(1:.N,Weight)]
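
To see what that index does, here is a minimal sketch on a hypothetical three-row table (`small` and `expanded` are names made up for this illustration, not part of the question). Inside the brackets, `.N` is the number of rows and `Weight` is the column, so `rep()` repeats each row number by its own weight:

```r
library(data.table)

# Hypothetical three-row table, small enough to check by eye
small <- data.table(Weight = c(2L, 1L, 3L), y = c(0.1, 0.2, 0.3))

# rep(1:.N, Weight) is evaluated inside the table, giving the
# row index c(1, 1, 2, 3, 3, 3)
expanded <- small[rep(1:.N, Weight)]

# One output row per unit of Weight
nrow(expanded) == sum(small$Weight)
expanded$y
```

The whole expansion is a single subsetting step, so nothing is grown row by row.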


2015-04-03 14:10:21 Frank

Thanks, I was too focused on rbind and rbindlist. – 2015-04-03 14:38:32
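
On the question's idea of moving the binding outside the loop: a rough sketch, assuming the same kind of `test` table as in the question (the seed and the names `pieces`, `bound_once` and `vectorised` are invented for this illustration). Each row's copies are collected in a list and `rbindlist` is called once at the end, so the result is not copied on every iteration:

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
test <- data.table(Weight = sample(20:100, 500, replace = TRUE),
                   y = rnorm(500), z = rnorm(500))

# One piece per input row: row k repeated Weight[k] times
pieces <- lapply(seq_len(nrow(test)), function(k) {
  test[rep(k, test$Weight[k])]
})

# Bind all pieces in a single call instead of once per appended row
bound_once <- rbindlist(pieces)

# Compare against the vectorised index from the answer
vectorised <- test[rep(1:.N, Weight)]
all.equal(bound_once, vectorised)
```

This is only meant to show the pattern; for this particular task the one-line index above is still the simpler choice.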
