Organizing data by maximum and minimum values in R

My data is generated by the following code:

id <- c("1","2","1","2","1","1") 
status <- c("open","open","closed","closed","open","closed") 
date <- c("11-10-2017 15:10","10-10-2017 12:10","12-10-2017 22:10","13-10-2017 06:30","13-10-2017 09:30","13-10-2017 10:30") 
data <- data.frame(id,status,date) 
hour <- data.frame(do.call('rbind', strsplit(as.character(data$date),' ',fixed=TRUE))) 
hour <- hour[,2] 
hour <- as.POSIXlt(hour, format = "%H:%M")

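What I want to achieve is to select the earliest open time and the latest closed time for each id, so that the final result has one row per id.

Currently I use sqldf to solve this: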

sqldf("select * from (select id, status, date as closeDate, max(hour) as hour from data 
    where status='closed' 
    group by id,status) as a 
    join 
    (select id, status, date as openDate, min(hour) as hour from data 
    where status='open' 
    group by id,status) as b 
    using(id);")

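Question 1: Is there a simpler way to do this?

Question 2: If I give max(hour) any name other than hour, the result is no longer in date-time format but a series of numbers such as 1507864200, 1507807800. How can I keep the time format while giving the column a different name?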
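
On Question 2, the numbers shown are what a date-time looks like once its class is dropped: seconds since 1970-01-01 UTC. sqldf appears to restore a column's class only when the output column keeps the name of an input column, which would explain why renaming loses the format. A minimal base R sketch of converting such values back after the query, assuming the renamed column comes back as plain numeric (the variable names and the choice of UTC are illustrative, not from the original post):

```r
# Epoch seconds as reported in the question (class attribute lost).
secs <- c(1507864200, 1507807800)

# Restore the date-time class. tz = "UTC" is an assumption; use the
# time zone in which the values were originally created.
restored <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Format for display.
restored_text <- format(restored, "%Y-%m-%d %H:%M")
print(restored_text)
```

The same call can be applied to a whole column of a query result, for example by assigning the converted values back to the renamed column.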

Source

2017-10-12 Meilun HE

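Did you mean for 'hour' to be a column in your data? Maybe you forgot a 'data$hour <- hour' line? – Gregor

Using the plyr package:

(For some reason, as shown here, you have to convert hour to class POSIXct with as.POSIXct; otherwise you get an error message):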

#add hour to data.frame: 
data$hour <- as.POSIXct(hour) 
library(plyr) 
ddply(data, .(id), summarize, open=min(hour[status=="open"]), 
    closed=max(hour[status=="closed"]))
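
For comparison, the same per-id summary can be sketched in base R without plyr or sqldf. This is only an illustration under stated assumptions: the time of day is attached to a fixed dummy date (1970-01-01, UTC) so the result is reproducible, and the output column names hour_open and hour_closed are made up for this example. As in the question's code, only the time of day is compared, not the full date.

```r
# Same sample data as in the question.
data <- data.frame(
  id     = c("1", "2", "1", "2", "1", "1"),
  status = c("open", "open", "closed", "closed", "open", "closed"),
  date   = c("11-10-2017 15:10", "10-10-2017 12:10", "12-10-2017 22:10",
             "13-10-2017 06:30", "13-10-2017 09:30", "13-10-2017 10:30"),
  stringsAsFactors = FALSE
)

# Keep only the time of day and attach it to a fixed dummy date
# (assumption: 1970-01-01 in UTC) so the values are comparable.
time_of_day <- sub("^\\S+ ", "", data$date)
data$hour <- as.POSIXct(paste("1970-01-01", time_of_day),
                        format = "%Y-%m-%d %H:%M", tz = "UTC")

# Earliest open time per id: sort ascending, keep the first row of each id.
open_rows <- data[data$status == "open", ]
open_rows <- open_rows[order(open_rows$hour), ]
earliest_open <- open_rows[!duplicated(open_rows$id), c("id", "hour")]

# Latest closed time per id: sort descending, keep the first row of each id.
closed_rows <- data[data$status == "closed", ]
closed_rows <- closed_rows[order(closed_rows$hour, decreasing = TRUE), ]
latest_closed <- closed_rows[!duplicated(closed_rows$id), c("id", "hour")]

# One row per id; the suffixes give the two hour columns distinct names.
result <- merge(earliest_open, latest_closed, by = "id",
                suffixes = c("_open", "_closed"))
print(result)
```

Because the columns stay POSIXct throughout, they can be given any name without losing the time format.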

Source

2017-10-12 20:35:25 user3640617
