2017-08-09 22 views

Error when using a list of functions in dcast.data.table

I am trying to reshape data with dcast.data.table, but it throws an error when I pass a predefined list of functions.

require(data.table) 
require(Hmisc) 

n <- 2 
contributors <- 1:2 
dates <- 2 

DT <- data.table(ID = rep(rep(1:n, contributors), each = dates)) 
DT[, contributor := c(1,1,2,2,2,3)] 
DT[, date := c(1,2,1,1,2,2)] 
DT[, amount := rnorm(.N)] 
DT[, rate := c(1,1,1,3,3,4)] 
DT 
#    ID contributor date     amount rate
# 1:  1           1    1 -1.3888607    1
# 2:  1           1    2 -0.2787888    1
# 3:  2           2    1 -0.1333213    1
# 4:  2           2    1  0.6359504    3
# 5:  2           2    2 -0.2842529    3
# 6:  2           3    2 -2.6564554    4

var.list <- as.list(Cs(amount, rate)) 

collapse <- function(x) paste(x, collapse = ',') 
fun.list <- list(sum, collapse) 

dcast.data.table(data = DT, ID + contributor ~ date, 
       fun.aggregate = fun.list, 
       value.var = var.list, fill = NA) 
# Error in aggregate_funs(fun.call, lvals, sep, ...) : 
# When 'fun.aggregate' and 'value.var' are both lists, 'value.var' must be either of length =1 or =length(fun.aggregate). 

But the lengths are equal:

length(var.list) == length(fun.list) 
# [1] TRUE 

If fun.aggregate is defined directly inside the dcast call, there is no problem:

dcast.data.table(data = DT, ID + contributor ~ date, 
       fun.aggregate = list(sum, collapse), 
       value.var = var.list, fill = NA) 

#    ID contributor amount_sum_1 amount_sum_2 rate_collapse_1 rate_collapse_2
# 1:  1           1   -1.3888607   -0.2787888               1               1
# 2:  2           2    0.5026291   -0.2842529             1,3               3
# 3:  2           3           NA   -2.6564554              NA               4

I would like to know why this happens and how I can work around the error, so that I can use a predefined list of functions with dcast.data.table.


This looks like the issue reported [here](https://github.com/Rdatatable/data.table/issues/1369) – akrun

Answers


For what it's worth, you can build the call to dcast by hand, using substitute() to pass the list literal supplied by the user through to dcast, like so:

z = as.data.table(expand.grid(a=LETTERS[1:3],b=1:3,c=5:6,d=3:4,stringsAsFactors =FALSE))[sample(36,9)] 

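# Use the function's own arguments (the original referred to an undefined `zz`
# and hard-coded the formula and value.var).
myfun = function(DT,fmla,funs,vars) 
    do.call("dcast",list(DT,fmla,fun.aggregate=substitute(funs),value.var = vars)) 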

myfun(z,a~.,list(sum,mean),list('c','d')) 

>    a c_sum   d_mean
> 1: A    24 3.500000
> 2: B    10 3.500000
> 3: C    18 3.333333

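However, the user (i.e. whoever calls myfun() in this example) still has to supply a list literal, because this does not touch dcast's internals: it simply passes the AST of the argument through to fun.aggregate, which requires a list literal.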
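
To illustrate the point about the AST, here is a minimal sketch of one possible workaround for a predefined set of functions. It is not taken from the linked issue. It assumes the functions can be kept as character names (the hypothetical `fun.names`), builds the `list(sum, collapse)` expression from them, and splices it into the call with bquote(). The helper `peek()` is also hypothetical and only shows what a function sees through substitute(). The `amount` values are fixed, made-up numbers rather than rnorm() draws so that the result is reproducible.

```r
library(data.table)

# Same shape as the question's data, with fixed amounts (illustrative values only)
DT <- data.table(
  ID          = c(1, 1, 2, 2, 2, 2),
  contributor = c(1, 1, 2, 2, 2, 3),
  date        = c(1, 2, 1, 1, 2, 2),
  amount      = c(10, 20, 30, 40, 50, 60),
  rate        = c(1, 1, 1, 3, 3, 4)
)

collapse <- function(x) paste(x, collapse = ",")

# What a function sees through substitute(): a bare symbol for a predefined
# variable, but a list(...) call for a list literal.
peek <- function(x) substitute(x)
fun.list <- list(sum, collapse)
seen.variable <- peek(fun.list)             # the symbol `fun.list`
seen.literal  <- peek(list(sum, collapse))  # the call `list(sum, collapse)`

# Assumption of this sketch: the functions are predefined by *name*
fun.names <- c("sum", "collapse")
var.list  <- list("amount", "rate")

# Build the expression list(sum, collapse) from the names
fun.call <- as.call(lapply(c("list", fun.names), as.name))

# Splice the expression into the dcast call, so that dcast receives a list literal
res <- eval(bquote(
  dcast(DT, ID + contributor ~ date,
        fun.aggregate = .(fun.call),
        value.var = var.list, fill = NA)
))
print(res)
```

Keeping names rather than function objects is a design choice of this sketch: names are easy to turn back into an expression, and they should also yield the same column suffixes (such as `amount_sum_1`) that the literal call produces.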