2015-11-03

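Creating a sequence of values based on a chronologically ordered object

I need to create a sequence of values (named "seq" in the data frame below) based on a chronologically ordered object (here, dates). To start a new sequence, the time interval between two consecutive dates needs to be strictly greater than 1 hour.

Here is an example: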

ID date     seq 
A  2010-04-14 02:00:12 1 
A  2010-04-14 02:00:12 1 
A  2010-04-14 03:00:10 1 
A  2010-04-14 03:00:10 1 
A  2010-04-14 04:00:15 1 
A  2010-04-14 04:00:15 1 
A  2010-04-14 08:00:10 2 
A  2010-04-14 08:00:10 2 
B  2010-04-14 03:00:18 3 
B  2010-04-14 03:00:18 3 
B  2010-04-14 04:00:10 3 
B  2010-04-14 04:00:10 3 
B  2010-04-14 10:00:14 4 
B  2010-04-14 10:00:14 4 
B  2010-04-14 11:00:10 4 
B  2010-04-14 11:00:10 4 

Data

tab <- data.frame(
  ID = rep(c("A", "B"), each = 8),
  date = as.POSIXct(c('2010-04-14 02:00:12', '2010-04-14 02:00:12',
                      '2010-04-14 03:00:10', '2010-04-14 03:00:10',
                      '2010-04-14 04:00:15', '2010-04-14 04:00:15',
                      '2010-04-14 08:00:10', '2010-04-14 08:00:10',
                      '2010-04-14 03:00:18', '2010-04-14 03:00:18',
                      '2010-04-14 04:00:10', '2010-04-14 04:00:10',
                      '2010-04-14 10:00:14', '2010-04-14 10:00:14',
                      '2010-04-14 11:00:10', '2010-04-14 11:00:10'),
                    format = '%Y-%m-%d %H:%M:%S')
)

Something like '1L + cumsum(diff(tab$date) > 60 * 60)' – Frank

Answers

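Your desired output seems incorrect, because there is more than one hour between "2010-04-14 03:00:10" and "2010-04-14 04:00:15", yet your sequence does not increment. It is also not clear from your example whether the sequence should increment when the ID changes.

Assuming seq should increment between "2010-04-14 03:00:10" and "2010-04-14 04:00:15", and that the value of ID should not affect seq, here is a solution: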

tab$seq <- c(0, cumsum(abs(diff(tab$date)) > 3600)) + 1 
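A minimal sketch of the same idea with the time unit stated explicitly, since diff() on date-times returns a difftime whose unit is chosen automatically. The time zone "UTC", the helper names gap_secs and id_changed, and the column seq_by_id are assumptions made for illustration; the second variant, which also breaks the sequence when ID changes, is hypothetical and only covers the other reading of the question.

```r
# Rebuild the question's data; tz = "UTC" is an assumption to keep the example deterministic.
stamps <- c("2010-04-14 02:00:12", "2010-04-14 03:00:10",
            "2010-04-14 04:00:15", "2010-04-14 08:00:10",
            "2010-04-14 03:00:18", "2010-04-14 04:00:10",
            "2010-04-14 10:00:14", "2010-04-14 11:00:10")
tab <- data.frame(
  ID   = rep(c("A", "B"), each = 8),
  date = as.POSIXct(rep(stamps, each = 2), tz = "UTC")
)

# Gap to the previous row, explicitly converted to seconds.
gap_secs <- as.numeric(diff(tab$date), units = "secs")

# A new sequence starts wherever the absolute gap is strictly greater than one hour.
# The leading 0 restores the first row, so the result has nrow(tab) elements.
tab$seq <- c(0, cumsum(abs(gap_secs) > 3600)) + 1

# Hypothetical variant: also start a new sequence whenever ID changes.
id_changed <- tab$ID[-1] != tab$ID[-nrow(tab)]
tab$seq_by_id <- c(0, cumsum(abs(gap_secs) > 3600 | id_changed)) + 1

print(tab$seq)
# 1 1 1 1 2 2 3 3 4 4 4 4 5 5 5 5
```

On this particular data the ID change coincides with a gap longer than one hour, so both columns come out identical.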

Thanks a lot, Joshua. I got this error message: Error in `$<-.data.frame`(`*tmp*`, "seq", value = c(1, 1, 1, 2, 2, 3, : replacement has 15 rows, data has 16 – Pierre

@Pierre: Fixed. –

This line of code should serve the purpose:

tab$seq <- floor(as.numeric(tab$date-min(tab$date))/3600)
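To see what this expression yields on the question's data, here is a small sketch that evaluates it with the unit made explicit through difftime(). The time zone "UTC" and the name hour_bin are assumptions for illustration. The result counts whole hours elapsed since the earliest timestamp, so it labels hour bins rather than consecutive sequence numbers.

```r
# Rebuild the question's data; tz = "UTC" is an assumption to keep the example deterministic.
stamps <- c("2010-04-14 02:00:12", "2010-04-14 03:00:10",
            "2010-04-14 04:00:15", "2010-04-14 08:00:10",
            "2010-04-14 03:00:18", "2010-04-14 04:00:10",
            "2010-04-14 10:00:14", "2010-04-14 11:00:10")
tab <- data.frame(
  ID   = rep(c("A", "B"), each = 8),
  date = as.POSIXct(rep(stamps, each = 2), tz = "UTC")
)

# Seconds elapsed since the earliest timestamp, then whole hours elapsed.
elapsed_secs <- as.numeric(difftime(tab$date, min(tab$date), units = "secs"))
hour_bin <- floor(elapsed_secs / 3600)

print(hour_bin)
# 0 0 0 0 2 2 5 5 1 1 1 1 8 8 8 8
```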