Shape's answer is the right way to solve your problem.
To extend Shape's answer, I would like to contribute a more generic solution.
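The eav function from the dwtools package is designed to work with Entity-attribute-value (EAV) data structures by making measure calculations simpler. The function is defined below, so you do not need the dwtools package.
In the example below it computes the rm variable for each group. The calculation is passed as the quoted j argument, written the same way as it would be in a [.data.table call made after casting the EAV data to wide form and before melting it back to EAV.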
library(data.table)
# Aggregate the measures, cast the EAV table to wide form, evaluate the quoted
# expression j per group, then (unless wide = TRUE) melt the result back to EAV.
eav = function(x, j, id.vars = key(x)[-length(key(x))], variable.name = key(x)[length(key(x))],
               measure.vars = names(x)[!(names(x) %in% key(x))], fun.aggregate = sum,
               shift.on = character(), wide = FALSE){
  stopifnot(is.data.table(x))
  r <- x[, lapply(.SD, fun.aggregate), c(id.vars, variable.name), .SDcols = measure.vars
         ][, dcast(.SD, formula = as.formula(paste(paste(id.vars, collapse = ' + '), paste(variable.name, collapse = ' + '), sep = ' ~ ')), fun.aggregate = fun.aggregate, value.var = measure.vars)
           ][, eval(j), by = eval(id.vars[!(id.vars %in% shift.on)])
             ]
  if(wide) r[] else melt(r, id.vars = id.vars, variable.name = variable.name, value.name = measure.vars)[, .SD, keyby = c(id.vars, variable.name)]
}
df = data.frame(day = c(1, 1, 2, 2, 3, 3), var = c("a", "b", "a", "b", "a", "b"), value = c(1, 2, 3, 3, 2, 1))
dt = as.data.table(df)
setkey(dt, day, var)
r = eav(dt, quote(rm := as.numeric(a >= b)))
print(r)
#    day var value
# 1:   1   a     1
# 2:   1   b     2
# 3:   1  rm     0
# 4:   2   a     3
# 5:   2   b     3
# 6:   2  rm     1
# 7:   3   a     2
# 8:   3   b     1
# 9:   3  rm     1
r[, if(value[var=="rm"] == 0) .SD, by = day
][var!="rm"] # you need to exclude temporary variable
#    day var value
# 1:   1   a     1
# 2:   1   b     2
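For readers who want to see what eav does under the hood, the cast, compute, and melt steps can be written out by hand. This is only a minimal sketch on the same sample data, not the dwtools implementation; the intermediate names wide, long, and res are made up for illustration.

```r
library(data.table)

# Same sample data as above: one row per (day, var) pair
dt <- data.table(
  day   = c(1, 1, 2, 2, 3, 3),
  var   = c("a", "b", "a", "b", "a", "b"),
  value = c(1, 2, 3, 3, 2, 1)
)

# 1. Cast from long (EAV) to wide: one column per attribute
wide <- dcast(dt, day ~ var, value.var = "value")

# 2. Compute the derived measure on the wide table
wide[, rm := as.numeric(a >= b)]

# 3. Melt back to long form, which is what eav returns when wide = FALSE
long <- melt(wide, id.vars = "day", variable.name = "var", value.name = "value")
setkey(long, day, var)

# 4. Keep the days where rm is 0, then drop the temporary rm rows
res <- long[, if (value[var == "rm"] == 0) .SD, by = day][var != "rm"]
print(res)
```

Writing the steps out like this is fine for a single measure; the wrapper mostly saves repetition when the same pattern is applied to many measures.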
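The eav solution is also likely to be slower than Shape's (you can populate a sample of your big data so that this can be measured), but it may make complex calculations over many measures in EAV easier, and it supports shifting; see the eav examples.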
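One way to measure this on a larger sample is sketched below. The helpers make_sample and time_on_sample are hypothetical names introduced here, the synthetic data is an assumption about what your real data looks like, and wide_filter is just a plain cast-and-filter stand-in, not Shape's actual code.

```r
library(data.table)

# Hypothetical helper: synthetic EAV table with n_days entities and
# two attributes ("a" and "b"), keyed like the example data.
make_sample <- function(n_days, seed = 1L) {
  set.seed(seed)
  dt <- CJ(day = seq_len(n_days), var = c("a", "b"))
  dt[, value := sample(1:3, .N, replace = TRUE)]
  setkey(dt, day, var)
  dt[]
}

# Hypothetical helper: elapsed seconds for any function f that takes the keyed table
time_on_sample <- function(f, n_days = 1e5) {
  dt <- make_sample(n_days)
  system.time(f(dt))[["elapsed"]]
}

# Stand-in candidate: cast to wide and keep the days where a < b
# (the same days for which rm would be 0)
wide_filter <- function(dt) {
  wide <- dcast(dt, day ~ var, value.var = "value")
  dt[day %in% wide[a < b, day]]
}

elapsed <- time_on_sample(wide_filter, n_days = 1e4)
print(elapsed)
```

With eav defined as above, passing function(d) eav(d, quote(rm := as.numeric(a >= b))) to time_on_sample should give a comparable timing for the generic approach.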
Is there a reason you don't want to switch from long to wide, via dcast or tidyr? – Shape
@Shape Mainly for code efficiency. – Vedda