2015-08-25
1

Calculating difftime between columns of different rows

Given is a series of jobs, each with a start time, an end time, and an owner.

Data <- data.frame(
    job = c(1, 2, 3, 4, 5), 
    owner = c("name1", "name2", "name1", "name1", "name2"), 
    Start = as.POSIXct(c("2015-01-01 15:00:00", "2015-01-01 15:01:00", "2015-01-01 15:13:00", "2015-01-01 15:20:00", "2015-01-01 15:39:02"), format="%Y-%m-%d %H:%M:%S"), 
    End = as.POSIXct(c("2015-01-01 15:11:11", "2015-01-01 15:17:21", "2015-01-01 15:17:00", "2015-01-01 15:31:21", "2015-01-01 15:40:11"), format="%Y-%m-%d %H:%M:%S") 
) 

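For each row I want to calculate the idle time between consecutive jobs of the same owner. How can I use difftime() to compute the time difference between a column in one row and a different column in another row?

The result should look like this: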

job, owner, idletime 
1, name1, NA 
2, name2, NA 
3, name1, 1.816667 # start of row 3 minus end of row 1 
4, name1, 3.0  # start of row 4 minus end of row 3 
... 
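
For reference, the 1.816667 expected for job 3 can be reproduced by hand with a single difftime() call. This is only a sketch of the arithmetic for one pair of rows; the variable names and the explicit tz = "UTC" are assumptions added for the illustration and are not part of the data above.

```r
# End of job 1 and start of job 3 (both owned by name1).
# tz = "UTC" is an assumption made only so the example is reproducible.
end_job1   <- as.POSIXct("2015-01-01 15:11:11", tz = "UTC")
start_job3 <- as.POSIXct("2015-01-01 15:13:00", tz = "UTC")

# Idle time = start of the later job minus end of the earlier job.
gap <- difftime(start_job3, end_job1, units = "mins")

# 1 minute 49 seconds = 109 seconds = 109 / 60 minutes (about 1.816667)
as.numeric(gap)
```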

Answers

5

Here is an approach using data.table:

library(data.table) # v 1.9.5+ 
setDT(Data)[, idletime := difftime(Start, shift(End), units = "mins"), by = owner] 
#    job owner               Start                 End       idletime
# 1:   1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11        NA mins
# 2:   2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21        NA mins
# 3:   3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00  1.816667 mins
# 4:   4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21  3.000000 mins
# 5:   5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins

Or a possible solution using dplyr:

library(dplyr) 
Data %>% 
    group_by(owner) %>% 
    mutate(idletime = difftime(Start, lag(End), units = "mins")) 

# Source: local data frame [5 x 5] 
# Groups: owner 
# 
#   job owner               Start                 End       idletime
# 1   1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11        NA mins
# 2   2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21        NA mins
# 3   3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00  1.816667 mins
# 4   4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21  3.000000 mins
# 5   5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins
0
library(dplyr) 


Data <- data.frame(
    job = c(1, 2, 3, 4, 5), 
    owner = c("name1", "name2", "name1", "name1", "name2"), 
    Start = as.POSIXct(c("2015-01-01 15:00:00", "2015-01-01 15:01:00", "2015-01-01 15:13:00", "2015-01-01 15:20:00", "2015-01-01 15:39:02"), format="%Y-%m-%d %H:%M:%S"), 
    End = as.POSIXct(c("2015-01-01 15:11:11", "2015-01-01 15:17:21", "2015-01-01 15:17:00", "2015-01-01 15:31:21", "2015-01-01 15:40:11"), format="%Y-%m-%d %H:%M:%S") 
) 


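Data %>% 
    group_by(owner) %>% 
    arrange(Start) %>%                # make sure jobs are in chronological order 
    mutate(lagEnd = lag(End),         # End of the previous job of the same owner 
           idletime = difftime(Start, lagEnd, units = "mins")) %>% 
    ungroup() %>% 
    arrange(job) %>%                  # restore the original job order 
    select(job, owner, idletime) 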

#   job owner       idletime
# 1   1 name1        NA mins
# 2   2 name2        NA mins
# 3   3 name1  1.816667 mins
# 4   4 name1  3.000000 mins
# 5   5 name2 21.683333 mins
2

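With base R, one option is ave. We use ave to get the previous 'End' within each 'owner' group and pass it as the second argument of difftime to create 'idtime'. Base R's own lag() (stats::lag) only shifts the time base of a series and leaves the values in place, so a small anonymous function does the shifting here.

Data$idtime <- with(Data, difftime(Start, ave(End, owner, FUN = function(x) c(as.POSIXct(NA), head(x, -1))), units = 'mins')) 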

Data 
#   job owner               Start                 End         idtime
# 1   1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11        NA mins
# 2   2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21        NA mins
# 3   3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00  1.816667 mins
# 4   4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21  3.000000 mins
# 5   5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins

Note: I named the column 'idtime' to keep the code on a single line :-)
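
As a cross-check that needs no package and no lag-style function, the same idle times can be computed with plain vector indexing. This is a minimal sketch rather than part of the answer: idle_minutes is a hypothetical helper name, the explicit tz = "UTC" is an assumption, and the helper assumes each owner's jobs should be compared in order of their start time.

```r
# Hypothetical helper (not from the answers above): idle minutes before each
# job, measured from the end of the same owner's previous job.
idle_minutes <- function(owner, start, end) {
  owner <- as.character(owner)
  ord <- order(owner, start)            # group by owner, chronological within owner
  o <- owner[ord]
  s <- as.numeric(start[ord])           # seconds since the epoch
  e <- as.numeric(end[ord])
  n <- length(s)

  prev_end <- c(NA, e[-n])              # End of the preceding row
  same_owner <- c(FALSE, o[-1] == o[-n])
  prev_end[!same_owner] <- NA           # an owner's first job has no predecessor

  idle <- numeric(n)
  idle[ord] <- (s - prev_end) / 60      # back to the original row order, in minutes
  idle
}

# Same data as in the question; tz = "UTC" is an assumption for reproducibility.
owner <- c("name1", "name2", "name1", "name1", "name2")
Start <- as.POSIXct(c("2015-01-01 15:00:00", "2015-01-01 15:01:00",
                      "2015-01-01 15:13:00", "2015-01-01 15:20:00",
                      "2015-01-01 15:39:02"), tz = "UTC")
End   <- as.POSIXct(c("2015-01-01 15:11:11", "2015-01-01 15:17:21",
                      "2015-01-01 15:17:00", "2015-01-01 15:31:21",
                      "2015-01-01 15:40:11"), tz = "UTC")

idle <- idle_minutes(owner, Start, End)
idle
```

The result is a plain numeric vector of minutes; wrapping it in as.difftime(idle, units = "mins") gives a difftime object like the one in the answers above.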