Aggregating a data.table by intervals of the original values

I have a `data.table` with an amount column, for example:
library(data.table)
n <- 1e5
set.seed(1)
dt <- data.table(id = 1:n, amount = pmax(0, rnorm(n, mean = 5e3, sd = 1e4)))
and a given vector of breaks, such as:
breaks <- as.vector(c(0, t(sapply(c(1, 2.5, 5, 7.5), function(x) x * 10^(1:4)))))
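For each interval defined by these breaks, I want to use `data.table` syntax to:

1. get the count of `amount` values the interval contains
2. get the count of `amount` values equal to or greater than the interval's left bound (basically `n * (1 - cdf(amount))`)

For 1, this mostly works, but it returns no row for empty intervals: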
dt[, .N, keyby = breaks[findInterval(amount,breaks)] ] #would prefer to get 0 for empty intvl
For 2, I tried:
dt[, sum(amount >= thresh[.GRP]), keyby = breaks[findInterval(amount,breaks)] ]
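but it did not work, because `sum` is restricted to the rows within each group and cannot see beyond them. So I came up with a workaround, which also returns the empty intervals: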
dt[, cbind(breaks, sapply(breaks, function(x) sum(amount >= x)))] # desired result
So, what is the `data.table` way to fix my attempt at 2 and to get the empty intervals in both cases?
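
To make the desired output concrete, here is a minimal sketch on a small made-up table. The names `dt_small`, `brk`, `lower`, `res` and `n_ge` are hypothetical and not part of the question. It assumes every amount is at least the first break, as in the data above, so `findInterval` never returns 0. The idea is to count per interval, join against the full vector of breaks so that empty intervals appear, and then take a reverse cumulative sum for the "equal to or greater than" counts. This is one possible approach, not necessarily the idiomatic one.

```r
library(data.table)

# Small illustrative data (hypothetical values)
dt_small <- data.table(id = 1:6, amount = c(0, 5, 12, 30, 30, 120))
brk <- c(0, 10, 25, 50, 75, 100)

# 1. Count per interval, labelled by the interval's left bound
counts <- dt_small[, .N, by = .(lower = brk[findInterval(amount, brk)])]

# Join against all breaks so that empty intervals show up (as NA), then set them to 0
res <- counts[.(lower = brk), on = "lower"]
res[is.na(N), N := 0L]

# 2. Count of amounts >= each left bound: reverse cumulative sum of the interval counts
res[, n_ge := rev(cumsum(rev(N)))]

print(res)
```

The `n_ge` column should agree with the `sapply` workaround applied to the same data, `sapply(brk, function(x) sum(dt_small$amount >= x))`, while scanning the data only once.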
Check out some of the questions on `foverlaps`; here are just a few: [1](http://stackoverflow.com/questions/25815032/finding-overlaps-between-interval-sets-efficient-overlap-joins), [2](http://stackoverflow.com/questions/28540466/how-to-identify-overlaps-in-multiple-columns), [3](http://stackoverflow.com/questions/34245295), [4](http://stackoverflow.com/questions/27574775/is-it-possible-to-use-the-r-data-table-funcion-foverlaps-to-find-the-intersectio) – MichaelChirico