2013-04-22 65 views

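data.table: create columns from a list column

I cannot figure out how to do the following: create a dynamic number of columns from a list column in a data.table.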

library(data.table)
set.seed(123); N <- 1e5
DT <- data.table(x = rnorm(N), y = sample(c('a','b','c'), N, TRUE))
probs <- seq(.1, 1, .1); newCols <- paste("q", 100*probs, sep = "")

DT2 <- DT[ ,list(Q=list(quantile(x,probs=probs))),by=y] 
DT2 
# y                   Q 
#1: b -1.2817037351734,-0.840293441466144,-0.525195748246148,-0.259574774974136, 
#2: c -1.26975023312311,-0.832359658553173,-0.513320691339448,-0.247863323660894, 
#3: a -1.28189935066568,-0.838918942382995,-0.522409189372727,-0.257356179072232, 

#Here I want to create 10 columns from Q called q10, q20... 
DT2[ , newCols:=Q] #can't make this work because it is evaluated in the wrong environment I guess 

Answers


Try this:

DT2 <- DT[ , as.list(quantile(x,probs=probs)),by=y] 
setnames(DT2, c("y", paste0("q", seq(10, 100, by=10)))) 

# y  q10  q20  q30  q40   q50  q60  q70  q80 
# 1: b -1.281704 -0.8402934 -0.5251957 -0.2595748 -0.001625739 0.2526686 0.5251940 0.8379979 
# 2: c -1.269750 -0.8323597 -0.5133207 -0.2478633 0.003413041 0.2598378 0.5353759 0.8477539 
# 3: a -1.281899 -0.8389189 -0.5224092 -0.2573562 0.001186281 0.2542550 0.5244238 0.8401411 
#   q90  q100 
# 1: 1.284773 3.856234 
# 2: 1.283465 4.322815 
# 3: 1.273615 3.921410 
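
If the list column Q from the question already exists and you want to split it in place, here is a minimal sketch of one alternative. It assumes a data.table version that exports `transpose()` (not available when this thread was written), and it builds `newCols` with `paste0()` as in the answer above. Wrapping the variable in parentheses, `(newCols)`, makes `:=` use the names stored in it rather than create a single column literally called `newCols`.

```r
library(data.table)

set.seed(123)
N <- 1e5
DT <- data.table(x = rnorm(N), y = sample(c("a", "b", "c"), N, TRUE))
probs <- seq(.1, 1, .1)
newCols <- paste0("q", seq(10, 100, by = 10))

# List column as in the question: one vector of 10 quantiles per group
DT2 <- DT[, list(Q = list(quantile(x, probs = probs))), by = y]

# transpose() turns 3 vectors of length 10 into 10 vectors of length 3,
# i.e. one vector per new column
DT2[, (newCols) := data.table::transpose(Q)]

# Drop the original list column once it has been split
DT2[, Q := NULL]
```

The result should have the same shape as the output above: one row per group and the columns y, q10, ..., q100.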

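Nicely done. I could not find another post mentioning this trick, although I have the impression that it is vanilla... Strangely, if I create a vector of NAs whose names are newCols, `DT2[, names(MyNAVec) := Q]` would work. – statquant 2013-04-22 14:59:34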


@statquant, there are actually quite a few recent posts on this (probably not easy to find, since they don't have obvious titles). See [**this**](http://stackoverflow.com/a/15510828/559784) and [**this**](http://stackoverflow.com/questions/6902087/proper-fastest-way-to-reshape-a-data-table/15512437#15512437). – Arun 2013-04-22 15:02:23


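@statquant, please check the edited solution. I have modified it to be *faster*. The previous solution was inefficient with many groups (because the names were created for each group). – Arun 2013-04-22 15:03:28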
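
One way to keep names from being built for every group, sketched here as an assumption rather than as the edit Arun actually made: `quantile()` accepts `names = FALSE`, so each group returns a plain unnamed vector and the column names are set once at the end with `setnames()`.

```r
library(data.table)

set.seed(123)
N <- 1e5
DT <- data.table(x = rnorm(N), y = sample(c("a", "b", "c"), N, TRUE))
probs <- seq(.1, 1, .1)
newCols <- paste0("q", seq(10, 100, by = 10))

# names = FALSE skips building the "10%", "20%", ... labels in every group
DT2 <- DT[, as.list(quantile(x, probs = probs, names = FALSE)), by = y]

# Name all columns once, after the grouped computation
setnames(DT2, c("y", newCols))
```

With only three groups the difference is negligible; it would matter mainly when `by` produces many groups.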