2016-07-06

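I have some code that tries to fill in the missing minutes in my data, df_stuff, by joining it to a time series that contains every minute of a full year. What I actually want is to aggregate this data at 15-minute intervals rather than by minute. Does anyone know a simple way to do this? I have been looking at to.minutes15 from the xts package, but it seems to have a problem with my POSIXct-format time series.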

Code:

library("sqldf") 

##Filling Gaps in time by minute 
myTZ <- "America/Los_Angeles" 
tseries <- seq(as.POSIXct("2015-01-01 00:00:00", tz=myTZ), 
       as.POSIXct("2015-12-31 23:59:00", tz=myTZ), by="min") 

df2 <- data.frame(SeqDateTime=tseries) 
finaldf <- sqldf("select df2.SeqDateTime, 
        median(df_stuff.brooms) as broomsTot 
        from df2 
        left outer join df_stuff on df2.SeqDateTime = df_stuff.broomTime 
        group by df2.SeqDateTime 
        order by df2.SeqDateTime asc") 

Data:

df_stuff <- structure(list(brooms = c(27, 53, 10, 55, 14, 49, 26, 
13, 12, NA, NA, 23, 28, 31, NA, 46, NA, 13, NA, 33, 12, 4, 28, 
34, 0, 24, 7, 31, 33, 37, 56, 41, 50, 55, 41, 15, 23, 26, 14, 
27, 22, 41, 48, 19, 28, 11, 11, NA, 49, NA), broomTime = structure(c(1423970100, 
1424122200, 1424136180, 1424035260, 1424141580, 1424122440, 1423274580, 
1424129580, 1424146320, 1429129320, 1429032060, 1429142940, 1428705000, 
1429142460, 1429128720, 1429204560, 1422909480, 1424137200, 1424042100, 
1424149620, 1424131920, 1424108940, 1424144820, 1424040600, 1424119620, 
1424148660, 1443593040, 1443657120, 1424125860, 1424223120, 1424235240, 
1424232720, 1424234940, 1424234640, 1424230440, 1424115300, 1429208280, 
1429131720, 1429148460, 1429151040, 1424129760, 1424125380, 1424123220, 
1424137380, 1424115780, 1424219340, 1424131560, 1424233560, 1424224920, 
1443640800), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("brooms", 
"broomTime"), row.names = c(NA, 50L), class = "data.frame") 

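A simple approach is integer division: 'DF$timeCat <- as.integer(DF$broomTime) %/% (15 * 60)' will split the timestamps into 15-minute periods (POSIXct values are stored in seconds, so the divisor is 900 rather than 15). – lmo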
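The integer-division idea from the comment can be sketched in base R as follows. This is only an illustration under stated assumptions: the five timestamps, the `demo` data frame, and the `bin` column are made up for the example, and the bin width is 900 seconds because POSIXct values are stored as seconds since the epoch.

```r
# Hypothetical sample data (not the question's df_stuff)
times <- as.POSIXct(c("2015-02-15 10:01:00", "2015-02-15 10:07:00",
                      "2015-02-15 10:14:00", "2015-02-15 10:15:00",
                      "2015-02-15 10:31:00"), tz = "UTC")
demo <- data.frame(broomTime = times, brooms = c(27, 53, 10, 55, 14))

# Width of one bin in seconds (15 minutes)
bin_seconds <- 15 * 60

# Integer-divide the epoch seconds, then multiply back to get each bin's start
bin_start <- (as.numeric(demo$broomTime) %/% bin_seconds) * bin_seconds
demo$bin <- as.POSIXct(bin_start, origin = "1970-01-01", tz = "UTC")

# Median of brooms within each 15-minute bin
agg <- aggregate(brooms ~ bin, data = demo, FUN = median)
print(agg)
```

Multiplying back by the bin width keeps the group key as a real timestamp, which is easier to join against a regular time grid than a bare integer bin number.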

Answers


You can summarize by any time interval by using the cut function inside group_by in dplyr.

library(dplyr) 
ans <- finaldf %>% 
     group_by(SeqDateTime = cut(SeqDateTime, breaks = "15 min")) %>% 
     summarize(broomsTot = sum(as.numeric(broomsTot), na.rm = TRUE)) 

head(ans) 
Source: local data frame [6 x 2] 

      SeqDateTime broomsTot 
       (fctr)  (dbl) 
1 2015-01-01 02:00:00   0 
2 2015-01-01 02:15:00   0 
3 2015-01-01 02:30:00   0 
4 2015-01-01 02:45:00   0 
5 2015-01-01 03:00:00   0 
6 2015-01-01 03:15:00   0 
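One thing to watch in output like the above: with sum(..., na.rm = TRUE), a window that contains no data at all also comes out as 0, so it cannot be told apart from a window whose readings really were zero. The base R sketch below illustrates the difference using a median that keeps empty windows as NA. The `minutes` and `values` vectors are made-up sample data, and tapply is used here only to keep the example free of package dependencies.

```r
# Hypothetical 45 minutes of data: an empty window, a full window,
# and a window whose only reading is a genuine 0
minutes <- seq(as.POSIXct("2015-01-01 00:00:00", tz = "UTC"),
               by = "min", length.out = 45)
values <- c(rep(NA, 15), 1:15, rep(NA, 14), 0)

# cut() assigns each timestamp to a 15-minute bin (returned as a factor)
bins <- cut(minutes, breaks = "15 min")

# Sum with na.rm = TRUE: the empty window and the genuine zero both give 0
sums <- tapply(values, bins, sum, na.rm = TRUE)

# Median that reports NA when a window has no observations at all
meds <- tapply(values, bins, function(v) {
  if (all(is.na(v))) NA_real_ else median(v, na.rm = TRUE)
})

print(sums)
print(meds)
```

Which summary is appropriate depends on what brooms measures. The question itself used median, so the NA-preserving variant may be closer to what was intended.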

I can assure you that xts has no problem with your POSIXct time series. xts uses POSIXct as its internal time index.

Here is how to join df_stuff to a 1-minute series and then aggregate the result into a 15-minute series.

library(xts) 
# create xts object 
xts_stuff <- with(df_stuff, xts(brooms, broomTime)) 
# merge with empty xts object that contains a regular 1-minute index 
xts_stuff_1min <- merge(xts_stuff, xts(,tseries)) 
# aggregate to 15-minutes 
ep15 <- endpoints(xts_stuff_1min, "minutes", 15) 
final_df <- period.apply(xts_stuff_1min, ep15, median, na.rm=TRUE)
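To see what the endpoints and period.apply steps produce, here is a small sketch on made-up data, followed by a conversion back to a data frame shaped like the question's finaldf. The 30-minute sample, the coredata() wrapper around median, and the column names in `out` are assumptions for illustration, not part of the answer above.

```r
library(xts)

# Hypothetical 30-minute sample: readings in the first window, gaps in the second
idx <- seq(as.POSIXct("2015-01-01 00:00:00", tz = "UTC"),
           by = "min", length.out = 30)
x <- xts(c(1:15, rep(NA, 15)), order.by = idx)

# Row positions that close each 15-minute window (the vector starts with 0)
ep <- endpoints(x, on = "minutes", k = 15)

# Median per window; coredata() strips the index so median() sees plain numbers
agg <- period.apply(x, ep, function(w) median(coredata(w), na.rm = TRUE))

# Back to a plain data frame
out <- data.frame(SeqDateTime = index(agg),
                  broomsTot = as.numeric(coredata(agg)))
print(out)
```

Each aggregated row is stamped with the last minute of its window (00:14 and 00:29 here) rather than the window start, which matters if the result is later joined to data keyed on the start of each interval.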