2017-03-09

Given a data frame, compute values for each row of the data frame

  ID days dose1 dose2 dose3 dose4  pattern
1 TM    2  11.0    45   0.2   0.1    spots
2 ZZ   18   2.0     6   8.0   0.0 no spots
3 YY    5   0.4     8  10.0  20.0 no spots
4 GG    5   0.4     8  10.0  20.0    spots


df <- structure(list(ID = c("TM", "ZZ", "YY", "GG"), days = c(2L, 18L, 
5L, 5L), dose1 = c(11, 2, 0.4, 0.4), dose2 = c(45L, 6L, 8L, 8L 
), dose3 = c(0.2, 8, 10, 10), dose4 = c(0.1, 0, 20, 20), pattern = c("spots", 
"no spots", "no spots", "spots")), .Names = c("ID", "days", "dose1", 
"dose2", "dose3", "dose4", "pattern"), row.names = c(NA, -4L), class = "data.frame") 

library(data.table) 
setDT(df) 

I want to do the calculation below for each row, and also summarize it by the patterns "spots" and "no spots":

dfx <- df[, list(
    Cal1 = sum(dose1>0)/days, 
    Cal2 = sum(dose2>0)/days, 
    Cal3 = sum(dose3>0)/days, 
    Cal4 = sum(dose4>0)/days 
), by=pattern] 

Is there any way I can do the calculation above for each row and add the result to the data frame dfx?


It's not entirely clear what you want to calculate: an indication of the expected output would help. – neilfws

Answers

2

There is no need to save the calculation in a separate data.frame; the new columns can be added to df by reference:

df[, paste0("Cal", 1:4) := .(sum(dose1 > 0) / days,
                             sum(dose2 > 0) / days,
                             sum(dose3 > 0) / days,
                             sum(dose4 > 0) / days), by = pattern]
df 
#    ID days dose1 dose2 dose3 dose4  pattern      Cal1      Cal2      Cal3       Cal4
# 1: TM    2  11.0    45   0.2   0.1    spots 1.0000000 1.0000000 1.0000000 1.00000000
# 2: ZZ   18   2.0     6   8.0   0.0 no spots 0.1111111 0.1111111 0.1111111 0.05555556
# 3: YY    5   0.4     8  10.0  20.0 no spots 0.4000000 0.4000000 0.4000000 0.20000000
# 4: GG    5   0.4     8  10.0  20.0    spots 0.4000000 0.4000000 0.4000000 0.40000000
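
To see where these numbers come from: within each pattern group, sum(dose4 > 0) is a single count of positive doses, and dividing by days spreads that count over the rows of the group. The following is only a minimal base R sketch of that arithmetic for the dose4 column; the names df0, n_pos and cal4 are made up for illustration and are not part of the answer.

```r
# Rebuild the relevant columns of the example as a plain data.frame.
df0 <- data.frame(
  ID      = c("TM", "ZZ", "YY", "GG"),
  days    = c(2L, 18L, 5L, 5L),
  dose4   = c(0.1, 0, 20, 20),
  pattern = c("spots", "no spots", "no spots", "spots"),
  stringsAsFactors = FALSE
)

# Count of positive dose4 values per pattern group, repeated on every
# row of that group (this mirrors sum(dose4 > 0) with by = pattern).
n_pos <- ave(as.numeric(df0$dose4 > 0), df0$pattern, FUN = sum)

# Divide the group count by each row's own days value.
cal4 <- n_pos / df0$days
print(cal4)
```

Here the "spots" group has two positive doses and the "no spots" group has one, so the rows work out to 2/2, 1/18, 1/5 and 2/5, which agrees with the Cal4 column above.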
2

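If you have many of those dose1, dose2, etc. columns, it becomes tedious to write a separate expression for each output column Cal1, Cal2, etc. Instead, data.table syntax lets you write this concisely with .SD and .SDcols: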

df[, paste0("Cal", 1:4) := lapply(.SD, function(x) sum(x > 0)/days), 
    by = pattern, .SDcols = paste0("dose", 1:4)] 
df 
#    ID days dose1 dose2 dose3 dose4  pattern      Cal1      Cal2      Cal3       Cal4
# 1: TM    2  11.0    45   0.2   0.1    spots 1.0000000 1.0000000 1.0000000 1.00000000
# 2: ZZ   18   2.0     6   8.0   0.0 no spots 0.1111111 0.1111111 0.1111111 0.05555556
# 3: YY    5   0.4     8  10.0  20.0 no spots 0.4000000 0.4000000 0.4000000 0.20000000
# 4: GG    5   0.4     8  10.0  20.0    spots 0.4000000 0.4000000 0.4000000 0.40000000
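
The question leaves it unclear whether a grouped count or a purely row-wise value is wanted. If the row-wise reading is the intended one (this is an assumption, not something the question confirms), a variant without by = pattern and without sum() might look like the sketch below. The object dt and the column names Row1 to Row4 are hypothetical.

```r
library(data.table)

# Same example data, built directly as a data.table.
dt <- data.table(
  ID      = c("TM", "ZZ", "YY", "GG"),
  days    = c(2L, 18L, 5L, 5L),
  dose1   = c(11, 2, 0.4, 0.4),
  dose2   = c(45L, 6L, 8L, 8L),
  dose3   = c(0.2, 8, 10, 10),
  dose4   = c(0.1, 0, 20, 20),
  pattern = c("spots", "no spots", "no spots", "spots")
)

dose_cols <- paste0("dose", 1:4)
row_cols  <- paste0("Row", 1:4)   # hypothetical output names

# Assumed row-wise interpretation: a 0/1 indicator of a positive dose
# on each row, divided by that row's days. No grouping is involved.
dt[, (row_cols) := lapply(.SD, function(x) (x > 0) / days),
   .SDcols = dose_cols]
print(dt)
```

With this reading each row depends only on its own values, so two rows with identical doses and days (YY and GG here) get identical results regardless of pattern.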

Nice. 'dplyr' has comparable shorthand syntactic sugar, though, which makes the expression simpler. –