Organizing data by maximum and minimum values in R
I have a table like this:
It is generated by the following code:
id <- c("1","2","1","2","1","1")
status <- c("open","open","closed","closed","open","closed")
date <- c("11-10-2017 15:10","10-10-2017 12:10","12-10-2017 22:10","13-10-2017 06:30","13-10-2017 09:30","13-10-2017 10:30")
data <- data.frame(id,status,date)
hour <- data.frame(do.call('rbind', strsplit(as.character(data$date),' ',fixed=TRUE)))
hour <- hour[,2]
hour <- as.POSIXlt(hour, format = "%H:%M")
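What I want to achieve is to select the earliest open time and the latest close time for each id. So the final result would look like this:
Currently I am using sqldf to solve this: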
sqldf("select * from (select id, status, date as closeDate, max(hour) as hour from data
where status='closed'
group by id,status) as a
join
(select id, status, date as openDate, min(hour) as hour from data
where status='open'
group by id,status) as b
using(id);")
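Question 1: Is there a simpler way to do this?
Question 2: If I give max(hour) any name other than hour, the result is no longer in date-time format but a series of numbers such as 1507864200, 1507807800. How can I keep the time format while giving the column a different name?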
Did you mean for 'hour' to be a column in your data? Maybe you forgot the 'data$hour <- hour' line? – Gregor
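For reference, here is a minimal base R sketch of one way the stated goal could be approached without sqldf. It assumes that the comparison should use the full timestamp in the date column (parsed as day-month-year in UTC) rather than only the time of day, and that every id has at least one open and one closed row. The names when, first_open, last_closed and result are made up for illustration. The large numbers mentioned in Question 2 look like seconds since 1970-01-01, so the sketch does its min/max on those seconds explicitly and converts back with as.POSIXct at the end.

```r
# Rebuild the example data from the question.
data <- data.frame(
  id     = c("1", "2", "1", "2", "1", "1"),
  status = c("open", "open", "closed", "closed", "open", "closed"),
  date   = c("11-10-2017 15:10", "10-10-2017 12:10", "12-10-2017 22:10",
             "13-10-2017 06:30", "13-10-2017 09:30", "13-10-2017 10:30"),
  stringsAsFactors = FALSE
)

# Assumption: compare full timestamps (day-month-year hour:minute), in UTC.
data$when <- as.POSIXct(data$date, format = "%d-%m-%Y %H:%M", tz = "UTC")

open_rows   <- data[data$status == "open", ]
closed_rows <- data[data$status == "closed", ]

# Work on seconds since the epoch so the grouped min/max are plain numbers.
first_open  <- tapply(as.numeric(open_rows$when),   open_rows$id,   min)
last_closed <- tapply(as.numeric(closed_rows$when), closed_rows$id, max)

# Assumption: every id appears in both subsets, so the names line up.
ids <- names(first_open)

# Convert the numbers back to date-times under whatever column names we like.
result <- data.frame(
  id        = ids,
  openDate  = as.POSIXct(as.numeric(first_open[ids]),
                         origin = "1970-01-01", tz = "UTC"),
  closeDate = as.POSIXct(as.numeric(last_closed[ids]),
                         origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)

print(result)
```

With the sample data, id 1 should come out with an open time of 11-10-2017 15:10 and a close time of 13-10-2017 10:30, and id 2 with 10-10-2017 12:10 and 13-10-2017 06:30. If only the time of day matters, as the hour column in the question suggests, the same pattern would apply to a parsed hour column instead.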