Check for overlapping time intervals using start and end times
I have this data frame, sorted by EndTime:
df = data.frame(
  ID = c(1, 1, 1, 1, 1, 1, 1),
  NumberInSequence = c(1, 2, 3, 4, 5, 6, 7),
  StartTime = as.POSIXct(c("2016-01-15 18:02:11 GMT", "2016-01-15 18:10:33 GMT", "2016-01-15 18:25:08 GMT",
                           "2016-01-15 18:33:56 GMT", "2016-01-15 18:21:03 GMT", "2016-01-15 19:55:09 GMT", "2016-01-15 19:57:03 GMT")),
  EndTime = as.POSIXct(c("2016-01-15 18:02:17 GMT", "2016-01-15 18:10:39 GMT", "2016-01-15 18:25:14 GMT",
                         "2016-01-15 18:34:02 GMT", "2016-01-15 19:53:17 GMT", "2016-01-15 19:56:15 GMT", "2016-01-15 19:58:17 GMT"))
)
Each row is a time interval with a start time and an end time.
df
ID NumberInSequence StartTime EndTime
1 1 1 2016-01-15 18:02:11 2016-01-15 18:02:17
2 1 2 2016-01-15 18:10:33 2016-01-15 18:10:39
3 1 3 2016-01-15 18:25:08 2016-01-15 18:25:14
4 1 4 2016-01-15 18:33:56 2016-01-15 18:34:02
5 1 5 2016-01-15 18:21:03 2016-01-15 19:53:17
6 1 6 2016-01-15 19:55:09 2016-01-15 19:56:15
7 1 7 2016-01-15 19:57:03 2016-01-15 19:58:17
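Then I use dplyr to add two fields: NextStartTime, which is the start time of the next row, and WaitTime, which is the difference between NextStartTime and EndTime. The WaitTime column is correct in most cases, except when there are overlapping intervals (see row 4 below, where it comes out negative).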
library(dplyr)

df %>% group_by(ID) %>%
mutate(
NextStartTime = lead(StartTime)[ifelse(lead(NumberInSequence) == (NumberInSequence + 1), TRUE, NA)] ,
WaitTime = difftime(NextStartTime,EndTime, units = 's')
#max_s = max(StartTime) #,
# cum_max_s = as.POSIXct(cummin(as.numeric(StartTime)),origin="1970-01-01")
)
ID NumberInSequence StartTime EndTime NextStartTime WaitTime
1 1 1 2016-01-15 18:02:11 2016-01-15 18:02:17 2016-01-15 18:10:33 496 secs
2 1 2 2016-01-15 18:10:33 2016-01-15 18:10:39 2016-01-15 18:25:08 869 secs
3 1 3 2016-01-15 18:25:08 2016-01-15 18:25:14 2016-01-15 18:33:56 522 secs
4 1 4 2016-01-15 18:33:56 2016-01-15 18:34:02 2016-01-15 18:21:03 -779 secs
5 1 5 2016-01-15 18:21:03 2016-01-15 19:53:17 2016-01-15 19:55:09 112 secs
6 1 6 2016-01-15 19:55:09 2016-01-15 19:56:15 2016-01-15 19:57:03 48 secs
7 1 7 2016-01-15 19:57:03 2016-01-15 19:58:17 <NA> NA secs
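Now I need to add a column called "FLAG" whose value is either OK or NOT OK, where
"OK" means the interval is neither entirely nor partially within another interval. So an "OK" interval has no overlap with any other interval.
"NOT OK" means the interval is partially or entirely within another interval. So a "NOT OK" interval overlaps with at least one other interval.
Below are the intervals, with a short description of what the FLAG column should contain for each: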
StartTime EndTime FLAG
2016-01-15 18:02:11 2016-01-15 18:02:17 OK - this interval does not overlap with other intervals
2016-01-15 18:10:33 2016-01-15 18:10:39 OK - this interval does not overlap with other intervals
2016-01-15 18:25:08 2016-01-15 18:25:14 NOT OK - this interval is within the interval that starts at 18:21:03
2016-01-15 18:33:56 2016-01-15 18:34:02 NOT OK - this interval is within the interval that starts at 18:21:03
2016-01-15 18:21:03 2016-01-15 19:53:17 NOT OK - this interval contains other intervals
2016-01-15 19:55:09 2016-01-15 19:56:15 OK - this interval does not overlap with other intervals
2016-01-15 19:57:03 2016-01-15 19:58:17 OK - this interval does not overlap with other intervals
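To make the rule concrete, here is a minimal base R sketch of the FLAG logic as a pairwise comparison. It assumes that two intervals overlap when each one starts strictly before the other ends, so intervals that merely touch at an endpoint stay "OK". The helper name has_overlap is hypothetical, and tz = "GMT" is set explicitly as an assumption.

```r
# Sample data from the question (tz = "GMT" is an explicit assumption).
day <- "2016-01-15"
df <- data.frame(
  ID = 1,
  StartTime = as.POSIXct(paste(day, c("18:02:11", "18:10:33", "18:25:08", "18:33:56",
                                      "18:21:03", "19:55:09", "19:57:03")), tz = "GMT"),
  EndTime   = as.POSIXct(paste(day, c("18:02:17", "18:10:39", "18:25:14", "18:34:02",
                                      "19:53:17", "19:56:15", "19:58:17")), tz = "GMT")
)

# has_overlap() is a hypothetical helper, not part of dplyr or base R.
# It returns TRUE for every interval that intersects at least one other interval.
has_overlap <- function(start, end) {
  s <- as.numeric(start)
  e <- as.numeric(end)
  # hits[i, j] is TRUE when interval i starts before j ends and ends after j starts
  hits <- outer(s, e, "<") & outer(e, s, ">")
  diag(hits) <- FALSE  # every interval trivially intersects itself
  rowSums(hits) > 0
}

df$FLAG <- ifelse(has_overlap(df$StartTime, df$EndTime), "NOT OK", "OK")
print(df)
```

With several IDs, the same helper could presumably be called inside group_by(ID) %>% mutate(...) so that intervals are only compared within an ID. The comparison is pairwise, so the cost grows with the square of the number of rows per group.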
I have been looking at using cummin or cummax in dplyr ... maybe something like this:
cum_max_s = as.POSIXct(cummin(as.numeric(StartTime)),origin="1970-01-01")
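One way the cummax idea might be carried through, as a sketch rather than a definitive answer: order the intervals by start time, then an interval overlaps an earlier one if it starts before the running maximum of the earlier end times, and it overlaps a later one if it ends after the next start time. The function name flag_overlaps is hypothetical, the code assumes intervals of positive length, and tz = "GMT" is again an assumption.

```r
# flag_overlaps() is a hypothetical helper built on base R's cummax().
flag_overlaps <- function(start, end) {
  s <- as.numeric(start)
  e <- as.numeric(end)
  n <- length(s)
  if (n == 0) return(character(0))

  ord <- order(s)          # work through the intervals by start time
  s_o <- s[ord]
  e_o <- e[ord]

  # Latest end time among all intervals that start earlier (-Inf for the first)
  prev_max_end <- c(-Inf, cummax(e_o)[-n])
  # Start time of the next interval (Inf for the last)
  next_start <- c(s_o[-1], Inf)

  not_ok <- (s_o < prev_max_end) | (e_o > next_start)

  flag <- character(n)
  flag[ord] <- ifelse(not_ok, "NOT OK", "OK")  # restore the original row order
  flag
}

# Sample data from the question
day <- "2016-01-15"
StartTime <- as.POSIXct(paste(day, c("18:02:11", "18:10:33", "18:25:08", "18:33:56",
                                     "18:21:03", "19:55:09", "19:57:03")), tz = "GMT")
EndTime   <- as.POSIXct(paste(day, c("18:02:17", "18:10:39", "18:25:14", "18:34:02",
                                     "19:53:17", "19:56:15", "19:58:17")), tz = "GMT")

FLAG <- flag_overlaps(StartTime, EndTime)
print(data.frame(StartTime, EndTime, FLAG))
```

Because it only needs a sort and a cumulative maximum, this version avoids comparing every pair of rows. Inside group_by(ID) %>% mutate(), it could be applied as FLAG = flag_overlaps(StartTime, EndTime).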