R: days since last event per ID
I am interested in finding the number of days since the last event, per ID. The data look like this:
df <- data.frame(date=as.Date(
c("06/07/2000","15/09/2000","15/10/2000","03/01/2001","17/03/2001",
"06/08/2010","15/09/2010","15/10/2010","03/01/2011","17/03/2011"), "%d/%m/%Y"),
event=c(0,0,1,0,1, 1,0,0,0,1),id = c(rep(1,5),rep(2,5)))
         date event id
1  2000-07-06     0  1
2  2000-09-15     0  1
3  2000-10-15     1  1
4  2001-01-03     0  1
5  2001-03-17     1  1
6  2010-08-06     1  2
7  2010-09-15     0  2
8  2010-10-15     0  2
9  2011-01-03     0  2
10 2011-03-17     1  2
I borrowed heavily from a data.table solution here, but it does not take the ID into account.
library(data.table)
setDT(df)
setkey(df, date,id)
df = df[event == 1, .(lastevent = date), key = date][df, roll = TRUE]
df[, tae := difftime(lastevent, shift(lastevent, 1L, "lag"), unit = "days")]
df[event == 0, tae:= difftime(date, lastevent, unit = "days")]
It produces the following output:
          date  lastevent event id       tae
 1: 2000-07-06       <NA>     0  1   NA days
 2: 2000-09-15       <NA>     0  1   NA days
 3: 2000-10-15 2000-10-15     1  1   NA days
 4: 2001-01-03 2000-10-15     0  1   80 days
 5: 2001-03-17 2001-03-17     1  1  153 days
 6: 2010-08-06 2010-08-06     1  2 3429 days
 7: 2010-09-15 2010-08-06     0  2   40 days
 8: 2010-10-15 2010-08-06     0  2   70 days
 9: 2011-01-03 2010-08-06     0  2  150 days
10: 2011-03-17 2011-03-17     1  2  223 days
But my desired output looks like this:
          date  lastevent event id      tae
 1: 2000-07-06       <NA>     0  1  NA days
 2: 2000-09-15       <NA>     0  1  NA days
 3: 2000-10-15 2000-10-15     1  1  NA days
 4: 2001-01-03 2000-10-15     0  1  80 days
 5: 2001-03-17 2001-03-17     1  1 153 days
 6: 2010-08-06 2010-08-06     1  2  NA days
 7: 2010-09-15 2010-08-06     0  2  40 days
 8: 2010-10-15 2010-08-06     0  2  70 days
 9: 2011-01-03 2010-08-06     0  2 150 days
10: 2011-03-17 2011-03-17     1  2 223 days
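The only difference is the NA in row 6 of column tae: the first event of id 2 should not be measured against the last event of id 1. This is a related post with no answer. I have looked here, but the solution does not apply to my case. There are many other questions on this, but none of them do the calculation per ID. Thanks!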
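For reference, one way the per-ID calculation could be sketched with data.table is shown below. It is not taken from any answer in this thread. It assumes data.table 1.12.4 or later (for nafill) and that rows should be ordered by id and then date. The helper columns ev_day, last_day and prev_day are hypothetical names introduced only for this illustration, and tae comes out as a plain number of days rather than a difftime.

```r
library(data.table)

df <- data.frame(
  date = as.Date(
    c("06/07/2000", "15/09/2000", "15/10/2000", "03/01/2001", "17/03/2001",
      "06/08/2010", "15/09/2010", "15/10/2010", "03/01/2011", "17/03/2011"),
    "%d/%m/%Y"),
  event = c(0, 0, 1, 0, 1, 1, 0, 0, 0, 1),
  id = c(rep(1, 5), rep(2, 5))
)

dt <- as.data.table(df)
setorder(dt, id, date)

# Event dates as day counts since 1970-01-01; NA on non-event rows.
dt[, ev_day := ifelse(event == 1, as.numeric(date), NA_real_)]

# Carry the most recent event date forward within each id.
# On an event row this is the row's own date.
dt[, last_day := nafill(ev_day, type = "locf"), by = id]

# Most recent event date as of the previous row of the same id.
dt[, prev_day := shift(last_day, 1L, type = "lag"), by = id]

dt[, lastevent := as.Date(last_day, origin = "1970-01-01")]

# Event rows count from the previous event of the same id,
# non-event rows count from the most recent event of the same id.
dt[, tae := ifelse(event == 1,
                   as.numeric(date) - prev_day,
                   as.numeric(date) - last_day)]

# Drop the hypothetical helper columns.
dt[, c("ev_day", "last_day", "prev_day") := NULL]

print(dt)
```

On the sample data this gives tae = NA, NA, NA, 80, 153 for id 1 and NA, 40, 70, 150, 223 for id 2, which matches the desired output above. Because every step that looks backwards is grouped with by = id, nothing carries over from one id to the next.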
So simple. Ouch. Thank you very much! –
Just wanted to mention that your code does not work for this data: df <- data.frame(date = as.Date(c("06/07/2000","15/09/2000","15/10/2000","03/01/2001","17/03/2001","18/03/2001","06/08/2010","15/09/2010","15/10/2010","03/01/2011","17/03/2011","19/03/2011"), "%d/%m/%Y"), event = c(1,0,0,0,0,0,1,1,1,0,1,0), id = c(rep(1,6),rep(5,6))) –
@HOSS_JFL let me know if the update works for you – simone