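cumsum and an if condition in dplyr do not give the expected output in R
I am trying to solve the problem below with dplyr and have made some progress, but I am stuck on a few points.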
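Problem statement
Within each group (grouped by ID), if the current HID and the previous HID of the same ID are different and Interval < 30, the Penalty column should show the value from Amount. In all other cases it should show 0 (the other cases being that the HIDs are the same, or that the HIDs differ but Interval >= 30).
Data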
"ID","DaysToEvent","HID","Interval","Amount"
2197560,16369,"011",29,90105
2197560,16494,"121",29,50526
2197560,16509,"121",29,194568
2197560,16569,"001",31,27236
2197560,16577,"128",29,17309
2197578,14447,"001",29,17276
2197578,14468,"021",29,12661
2197578,14489,"001",31,15015
2197578,14517,"001",29,19000
2197578,14517,"02P",29,19001
2197578,14517,"001",31,19002
2197578,14517,"001",29,19003
2197578,14517,"001",29,19004
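The HID values above ("011", "02P", and so on) are identifiers rather than numbers, so one reasonable way to load them is as plain character strings. The following is a minimal sketch, assuming the data is passed inline through `text =` instead of a file path; the name `csv_text` is introduced here only for illustration.

```r
# CSV text copied from the post; in practice this would be a file on disk.
csv_text <- '"ID","DaysToEvent","HID","Interval","Amount"
2197560,16369,"011",29,90105
2197560,16494,"121",29,50526
2197560,16509,"121",29,194568
2197560,16569,"001",31,27236
2197560,16577,"128",29,17309
2197578,14447,"001",29,17276
2197578,14468,"021",29,12661
2197578,14489,"001",31,15015
2197578,14517,"001",29,19000
2197578,14517,"02P",29,19001
2197578,14517,"001",31,19002
2197578,14517,"001",29,19003
2197578,14517,"001",29,19004'

# Force HID to character so leading zeros are kept and comparisons
# between rows are ordinary string comparisons.
mycoredata2009 <- read.csv(text = csv_text, colClasses = c(HID = "character"))

str(mycoredata2009$HID)
```

Reading HID as character avoids any dependence on how a particular R version converts string columns by default.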
My code
library(dplyr)
mycoredata2009 = read.csv('path/to/abovefile.csv')
CumulativeCumulativeCost = 0;
mycoredata2009 = mycoredata2009 %>%
group_by(ID) %>%
mutate(Penalty = ifelse(((HID != lag(HID)) & Interval < 30) ,Amount,0)) %>%
mutate(CumulativeCost=cumsum(as.numeric(Penalty))) %>%
CumulativeCumulativeCost = cumsum(as.numeric(CumulativeCost)) %>%
cat(paste("For group with ID==",ID,"CumulativeCost==", CumulativeCost,sep=""))
mycoredata2009 = as.data.frame(mycoredata2009)
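Problems I am currently facing
However, there are several problems with the code:
- The Penalty column shows Amount even when the current HID and the previous HID have the same value. (The other two conditions work correctly.)
- The CumulativeCost column, which should be the running total of the Penalty column, always shows NA.
- At the end of each group, I want to print that group's CumulativeCost, and keep inserting the group's ID and CumulativeCost into a final output data frame.
- I also want a variable called CumulativeCumulativeCost which, as the name suggests, is the running sum of each group's CumulativeCost.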
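The last two points (one total per group, then a running total across groups) could be sketched roughly as below. This assumes the Penalty values are already correct; the small `penalties` data frame is hypothetical and simply copies the hand-computed Penalty values from this post, and `group_totals` is an illustrative name.

```r
library(dplyr)

# Hypothetical input: Penalty values copied from the hand-computed
# expected output in this post (NA on the first row of each group).
penalties <- data.frame(
  ID = c(rep(2197560L, 5), rep(2197578L, 8)),
  Penalty = c(NA, 50526, 0, 0, 17309,
              NA, 12661, 0, 0, 19001, 0, 0, 0)
)

group_totals <- penalties %>%
  group_by(ID) %>%
  # One row per ID; na.rm = TRUE skips the first-row NA.
  summarise(CumulativeCost = sum(Penalty, na.rm = TRUE)) %>%
  # summarise() drops the single grouping level, so cumsum() here
  # runs across the groups rather than within them.
  mutate(CumulativeCumulativeCost = cumsum(CumulativeCost)) %>%
  as.data.frame()

# Print one line per group, outside the pipe.
for (i in seq_len(nrow(group_totals))) {
  cat("For group with ID==", group_totals$ID[i],
      " CumulativeCost==", group_totals$CumulativeCost[i], "\n", sep = "")
}
```

For the sample data this gives group totals of 67835 and 31662, and a running total of 67835 followed by 99497. Keeping the `cat()` call outside the pipe avoids mixing printing with the data transformation.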
Output received
ID DaysToEvent HID Interval Amount Penalty CumulativeCost
1 2197560 16369 011 29 90105 NA NA
2 2197560 16494 121 29 50526 50526 NA
3 2197560 16509 121 29 194568 194568 NA
4 2197560 16569 001 31 27236 0 NA
5 2197560 16577 128 29 17309 17309 NA
6 2197578 14447 001 29 17276 NA NA
7 2197578 14468 021 29 12661 12661 NA
8 2197578 14489 001 31 15015 0 NA
9 2197578 14517 001 29 19000 19000 NA
10 2197578 14517 02P 29 19001 19001 NA
11 2197578 14517 001 31 19002 0 NA
12 2197578 14517 001 29 19003 19003 NA
13 2197578 14517 001 29 19004 19004 NA
Expected output (calculated by hand)
ID DaysToEvent HID Interval Amount Penalty CumulativeCost
1 2197560 16369 011 29 90105 NA NA
2 2197560 16494 121 29 50526 50526 50526
3 2197560 16509 121 29 194568 0 50526
4 2197560 16569 001 31 27236 0 50526
5 2197560 16577 128 29 17309 17309 67835
6 2197578 14447 001 29 17276 NA NA
7 2197578 14468 021 29 12661 12661 12661
8 2197578 14489 001 31 15015 0 12661
9 2197578 14517 001 29 19000 0 12661
10 2197578 14517 02P 29 19001 19001 31662
11 2197578 14517 001 31 19002 0 31662
12 2197578 14517 001 29 19003 0 31662
13 2197578 14517 001 29 19004 0 31662
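One possible way to reproduce the hand-computed table above is sketched here. It rests on two assumptions that are not confirmed in the post: HID is compared as a character string, and the first row of each group should stay NA while the running total skips over it. Since `cumsum()` propagates NA, the NA that `lag()` produces on the first row of each group would otherwise make every later running total NA. The names `csv_text` and `result` are illustrative.

```r
library(dplyr)

# CSV text copied from the post.
csv_text <- '"ID","DaysToEvent","HID","Interval","Amount"
2197560,16369,"011",29,90105
2197560,16494,"121",29,50526
2197560,16509,"121",29,194568
2197560,16569,"001",31,27236
2197560,16577,"128",29,17309
2197578,14447,"001",29,17276
2197578,14468,"021",29,12661
2197578,14489,"001",31,15015
2197578,14517,"001",29,19000
2197578,14517,"02P",29,19001
2197578,14517,"001",31,19002
2197578,14517,"001",29,19003
2197578,14517,"001",29,19004'

# Assumption: HID is read as character, not factor or numeric.
mycoredata2009 <- read.csv(text = csv_text, colClasses = c(HID = "character"))

result <- mycoredata2009 %>%
  group_by(ID) %>%
  mutate(
    # NA on the first row of each group, because lag(HID) is NA there.
    Penalty = ifelse(HID != lag(HID) & Interval < 30, as.numeric(Amount), 0),
    # Treat NA as 0 inside cumsum(), then put NA back where Penalty is NA.
    CumulativeCost = ifelse(is.na(Penalty), NA_real_,
                            cumsum(ifelse(is.na(Penalty), 0, Penalty)))
  ) %>%
  ungroup() %>%
  as.data.frame()

print(result)
```

If the first row of each group should count as 0 instead of NA, `lag(HID, default = first(HID))` would be one alternative, at the cost of no longer matching the NA rows in the expected table.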