其實我現在正在努力解決這個問題。這裏是一個
部分解決方案,可通過單個列分組工作:
編輯:與RcppRoll得到了它,我想:
windowed.average <- function(input.table,
window.width = 2,
id.cols = names(input.table)[3],
index.col = names(input.table)[1],
val.col = names(input.table)[2]) {
require(RcppRoll)
avg.with.group <-
input.table[,roll_mean(get(val.col), n = window.width),by=c(id.cols)]
avg.index <-
input.table[,roll_mean(get(index.col), n = window.width),by=c(id.cols)]$V1
output.table <- data.table(
Group = avg.with.group,
Index = avg.index)
# rename columns to (sensibly) match inputs
setnames(output.table, old=colnames(output.table),
new = c(id.cols,val.col,index.col))
return(output.table)
}
A(寫的不好),單元測試會通過上面的:
require(testthat)
require(zoo)
test.datatable <- data.table(Time = rep(seq_len(10), times=2),
Voltage = runif(20),
Channel= rep(seq_len(2),each=10))
test.width <- 8
# first test: single id column
test.avgtable <- data.table(
test.datatable[,rollapply(Voltage, width = test.width, mean, na.rm=TRUE),
by=c("Channel")],
Time = test.datatable[,rollapply(Time, width = test.width, mean, na.rm=TRUE),
by=c("Channel")]$V1)
setnames(test.avgtable,old=names(test.avgtable),
new=c("Channel","Voltage","Time"))
expect_that(test.avgtable,
is_identical_to(windowed.average(test.datatable,test.width)))
如何看起來:
> test.datatable
Time Voltage Channel Class
1: 1 0.310935570 1 1
2: 2 0.565257533 1 2
3: 3 0.577278573 1 1
4: 4 0.152315111 1 2
5: 5 0.836052122 1 1
6: 6 0.655417230 1 2
7: 7 0.034859642 1 1
8: 8 0.572040136 1 2
9: 9 0.268105436 1 1
10: 10 0.126484340 1 2
11: 1 0.139711248 2 1
12: 2 0.336316520 2 2
13: 3 0.413086486 2 1
14: 4 0.304146029 2 2
15: 5 0.399344631 2 1
16: 6 0.581641210 2 2
17: 7 0.183586025 2 1
18: 8 0.009775488 2 2
19: 9 0.449576242 2 1
20: 10 0.938517952 2 2
> test.avgtable
Channel Voltage Time
1: 1 0.4630195 4.5
2: 1 0.4576657 5.5
3: 1 0.4028191 6.5
4: 2 0.2959510 4.5
5: 2 0.3346841 5.5
6: 2 0.4099593 6.5
不幸的是,我沒有設法使其與多個分組工作(因爲這第二部分顯示):
看起來不錯了多列組:
# second test: multiple id columns
# Depends on the first test passing to be meaningful.
test.width <- 4
test.datatable[,Class:= rep(seq_len(2),times=ceiling(nrow(test.datatable)/2))]
# windowed.average(test.datatable,test.width,id.cols=c("Channel","Class"))
test.avgtable <- rbind(windowed.average(test.datatable[Class==1,],test.width),
windowed.average(test.datatable[Class==2,],test.width))
# somewhat artificially attaching expected class labels
test.avgtable[,Class:= rep(seq_len(2),times=nrow(test.avgtable)/4,each=2)]
setkey(test.avgtable,Channel)
setcolorder(test.avgtable,c("Channel","Class","Voltage","Time"))
expect_that(test.avgtable,
is_equivalent_to(windowed.average(test.datatable,test.width,
id.cols=c("Channel","Class"))))
有什麼問題嗎? –
我想在data.table中有'roll'的一些功能,但也許你打算用動物園標記這個? – Frank
這裏沒有SO問題,這是開放的辯論(不是什麼樣的)。但是'data.table'已經實現了一個非常快速的'roll'參數,可以幫助做roll-join,window-join等幾個國王... – statquant