2017-10-18

How to create follow-up groups by time

I have the following data frame:

user time 
____ ____ 
1  2017-09-01 00:01:01 
1  2017-09-01 00:01:20 
1  2017-09-01 00:03:01 
1  2017-09-01 00:10:01 
1  2017-09-01 00:11:01 
2  2017-09-01 00:01:03 
2  2017-09-01 00:01:08 
2  2017-09-01 00:03:01 

From this data frame I want to create a follow-up group for each user, like this:

user   time      follow_group 
____ ____________________    _____________        
1  2017-09-01 00:01:01     1 
1  2017-09-01 00:01:20     1 
1  2017-09-01 00:03:01     1 
1  2017-09-01 00:10:01     2 
1  2017-09-01 00:11:01     2 
2  2017-09-01 00:01:03     1 
2  2017-09-01 00:01:08     1 
2  2017-09-01 00:03:01     1 

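The follow-up group changes whenever the time difference between consecutive rows of the same user is greater than 5 minutes.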

I tried taking a lag and subtracting:

data[, previous_request_time:=c(NA, time[-.N]), by=user] 

However, this does not seem to work. Any help is appreciated.

Answer

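Just do a difftime operation and check whether the difference is greater than 5 minutes. The cumulative sum of that check then gives you the group counter: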

dat[, 
    follow_group := cumsum(difftime(time, shift(time, fill=-Inf), units="mins") > 5), 
    by=user 
] 

# user    time follow_group 
#1: 1 2017-09-01 00:01:01   1 
#2: 1 2017-09-01 00:01:20   1 
#3: 1 2017-09-01 00:03:01   1 
#4: 1 2017-09-01 00:10:01   2 
#5: 1 2017-09-01 00:11:01   2 
#6: 2 2017-09-01 00:01:03   1 
#7: 2 2017-09-01 00:01:08   1 
#8: 2 2017-09-01 00:03:01   1 
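
To make the steps above easier to follow, here is a minimal reproducible sketch that builds the sample data and splits the one-liner into its intermediate pieces. The helper columns gap_mins and new_group, and the UTC time zone, are assumptions made for illustration and are not part of the original question. The sketch also uses is.na() to flag each user's first row instead of fill=-Inf, which is a variant of the same idea.

```r
library(data.table)

# Sample data copied from the question; the UTC time zone is an assumption.
dat <- data.table(
  user = c(1, 1, 1, 1, 1, 2, 2, 2),
  time = as.POSIXct(
    c("2017-09-01 00:01:01", "2017-09-01 00:01:20", "2017-09-01 00:03:01",
      "2017-09-01 00:10:01", "2017-09-01 00:11:01", "2017-09-01 00:01:03",
      "2017-09-01 00:01:08", "2017-09-01 00:03:01"),
    tz = "UTC"
  )
)

# Gap to the previous row of the same user, in minutes.
# shift(time) is the lagged time, so each user's first row gets NA.
dat[, gap_mins := as.numeric(difftime(time, shift(time), units = "mins")), by = user]

# A new group starts on each user's first row or after a gap of more than 5 minutes.
dat[, new_group := is.na(gap_mins) | gap_mins > 5]

# The running count of group starts within each user is the group id.
dat[, follow_group := cumsum(new_group), by = user]

print(dat)
```

Keeping gap_mins and new_group as columns makes it easy to inspect where each group boundary comes from; they can be dropped afterwards with dat[, c("gap_mins", "new_group") := NULL].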

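You could equally just use diff if you don't want to spell out the units. Converting to numeric first keeps the gaps in seconds, because diff on a POSIXct vector picks its units automatically:

dat[, follow_group := cumsum(c(Inf, diff(as.numeric(time))) > 5*60), by=user]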
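
As a small illustration of why the units matter for the diff variant, the sketch below uses hypothetical timestamps (not from the question) whose gaps are all at least one minute. In that case diff() on a POSIXct vector reports the gaps in minutes, so comparing them with 5*60 would silently never trigger. Working on as.numeric(time) keeps everything in seconds.

```r
# Hypothetical timestamps with gaps of 2 and 8 minutes; UTC is an assumption.
times <- as.POSIXct(
  c("2017-09-01 00:00:00", "2017-09-01 00:02:00", "2017-09-01 00:10:00"),
  tz = "UTC"
)

# diff() on POSIXct returns a difftime whose units are chosen automatically.
gap_units <- units(diff(times))
print(gap_units)

# Comparing the raw difftime values with 5*60 treats minutes as if they were seconds.
naive_ids <- cumsum(c(Inf, diff(times)) > 5 * 60)

# Converting to numeric first gives gaps that are always in seconds.
group_ids <- cumsum(c(Inf, diff(as.numeric(times))) > 5 * 60)

print(naive_ids)
print(group_ids)
```

With the sample data in the question every user has at least one gap shorter than a minute, so both forms happen to agree there; the numeric version is simply the safer default.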