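Can you explain this data.table by-group result?
I must be doing something obviously stupid here, but can someone explain why data.table does not seem to perform the following operation by group?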
library(data.table)
set.seed(1)
DT = data.table(grp=c(rep('a',100),rep('b',100)), val=c(runif(100), rnorm(100)))
DT[grp=='a',c(-Inf,quantile(val,probs=seq(.1,.9,.1)),Inf)]
10% 20% 30% 40% 50% 60% 70% 80% 90%
-Inf 0.1415 0.2555 0.3448 0.4108 0.4878 0.6442 0.7140 0.7842 0.8703 Inf
DT[grp=='b',c(-Inf,quantile(val,probs=seq(.1,.9,.1)),Inf)]
10% 20% 30% 40% 50% 60% 70% 80% 90%
-Inf -1.22751 -0.66000 -0.55036 -0.32170 -0.11762 0.06583 0.37427 0.69183 1.35196 Inf
DT[,interval:=cut(val,c(-Inf,quantile(val,probs=seq(.1,.9,.1)),Inf)),.(grp)][]
grp val interval
1: a 0.2655 (-0.66,-0.55] => this is a "b" interval ? I would expect (0.2555 0.3448]
2: a 0.3721 (-0.55,-0.322]
3: a 0.5729 (-0.118,0.0658]
4: a 0.9082 (1.35, Inf]
5: a 0.2017 (-1.23,-0.66]
---
196: b -0.7508 (-1.23,-0.66]
197: b 2.0872 (1.35, Inf]
198: b 0.0174 (-0.118,0.0658]
199: b -1.2863 (-Inf,-1.23]
200: b -1.6406 (-Inf,-1.23]
What I usually want to do is something like this:
DT[,mean(val),keyby=.(grp,interval=cut(val,c(-Inf,quantile(val,probs=seq(.1,.9,.1)),Inf)))]
grp interval V1
1: a (-0.321,0.0379] 0.01836077 => this is not an "a" interval
2: a (0.0379,0.21] 0.13190935
3: a (0.21,0.358] 0.29068707
4: a (0.358,0.477] 0.41647597
5: a (0.477,0.648] 0.55190648
6: a (0.648,0.777] 0.70883795
7: a (0.777,0.915] 0.84091210
8: a (0.915, Inf] 0.95797615
9: b (-Inf,-0.657] -1.23322909
10: b (-0.657,-0.321] -0.53243898
11: b (-0.321,0.0379] -0.13968720
12: b (0.0379,0.21] 0.11278201
13: b (0.21,0.358] 0.30783459
14: b (0.358,0.477] 0.40695489
15: b (0.477,0.648] 0.55976052
16: b (0.648,0.777] 0.70483170
17: b (0.777,0.915] 0.91017423
18: b (0.915, Inf] 1.57112705
This looks very much as if the intervals were computed on the whole data set rather than per group:
DT[,c(-Inf,quantile(val,probs=seq(.1,.9,.1)),Inf)]
10% 20% 30% 40% 50% 60% 70% 80% 90%
-Inf -0.65729223 -0.32084835 0.03788176 0.20967534 0.35835115 0.47738589 0.64820328 0.77734560 0.91505885 Inf
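The match between these whole-data breaks and the interval labels in the `keyby` result can be checked directly. The following is only a sketch that rebuilds the same toy data; the names `brks_all` and `res` are introduced here for illustration and are not part of the original code.

```r
library(data.table)

set.seed(1)
DT <- data.table(grp = rep(c("a", "b"), each = 100),
                 val = c(runif(100), rnorm(100)))

# Breaks computed from the whole val column, ignoring grp
brks_all <- c(-Inf, quantile(DT$val, probs = seq(.1, .9, .1)), Inf)

# Same grouping expression as in the question, placed inside keyby
res <- DT[, .N,
          keyby = .(grp,
                    interval = cut(val, c(-Inf, quantile(val, probs = seq(.1, .9, .1)), Inf)))]

# Compare the interval labels used for grouping with the whole-data labels
same_labels <- identical(levels(res$interval), levels(cut(DT$val, brks_all)))
print(same_labels)
```

If `same_labels` is `TRUE`, the expression inside `keyby` saw the full `val` vector rather than one group's slice of it.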
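Yes, everything in `by` or `keyby` is evaluated using the ungrouped vectors. Is that confusing? – Frank
Hell yes it is... `DT[, , .(A, B)]` looking at all values of A and all values of B, and then grouping on each (A, B) pair, is not what I was expecting... pffff – statquant
Right, `DT[i, j, by]` reads as: subset by `i`, then group by `by`, then do `j` on each group. – Frank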
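Following that reading, one possible way to get per-group intervals is to compute them in `j` first (where `val` is one group's slice) and only then group by the stored column. This is a sketch, not a confirmed answer from the thread: `decile_bin` is a hypothetical helper introduced here, and the `as.character()` conversion is an assumption made to avoid relying on how factor levels from different groups are combined by `:=`. The first output in the question, where "a" rows carry "b" labels, looks consistent with such a level mismatch.

```r
library(data.table)

set.seed(1)
DT <- data.table(grp = rep(c("a", "b"), each = 100),
                 val = c(runif(100), rnorm(100)))

# Hypothetical helper: decile bin of each value, returned as a character label
decile_bin <- function(x) {
  brks <- c(-Inf, quantile(x, probs = seq(.1, .9, .1)), Inf)
  as.character(cut(x, brks))
}

# Step 1: j is evaluated once per group, so quantile() only sees that group's val
DT[, interval := decile_bin(val), by = grp]

# Step 2: group by the precomputed column instead of an expression on val
res <- DT[, .(mean_val = mean(val), n = .N), keyby = .(grp, interval)]
print(res)
```

Because `interval` is a character column here, `keyby` orders the labels as strings rather than numerically; if numeric ordering matters, storing the bin index (for example via `findInterval()`) alongside the label is one alternative.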