2016-04-17 41 views

Filling data frame rows with date-time values

I have a CSV file with the following contents:

ts1 <- read.table(header = TRUE, sep = ",", strip.white = TRUE, text = "
start,end,value
1,26/11/2014 13:00,26/11/2014 20:00,decreasing
2,26/11/2014 20:00,27/11/2014 09:00,increasing")

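I want to transform the data frame above so that each row is expanded into one row per hour, each carrying that row's value. The hours are filled in from the start time up to the end time minus one hour, as follows: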

   date       hour  value
1  26/11/2014 13:00 decreasing
2  26/11/2014 14:00 decreasing
3  26/11/2014 15:00 decreasing
4  26/11/2014 16:00 decreasing
5  26/11/2014 17:00 decreasing
6  26/11/2014 18:00 decreasing
7  26/11/2014 19:00 decreasing
8  26/11/2014 20:00 increasing
9  26/11/2014 21:00 increasing
10 26/11/2014 22:00 increasing
11 26/11/2014 23:00 increasing
12 27/11/2014 00:00 increasing
13 27/11/2014 01:00 increasing
14 27/11/2014 02:00 increasing
15 27/11/2014 03:00 increasing
16 27/11/2014 04:00 increasing
17 27/11/2014 05:00 increasing
18 27/11/2014 06:00 increasing
19 27/11/2014 07:00 increasing
20 27/11/2014 08:00 increasing

I tried to start by separating the hour from the date:

> t <- strftime(ts1$end, format="%H:%M:%S") 
> t 
[1] "00:00:00" "00:00:00" 

Answers

1

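We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(ts1)) and group by the row sequence (1:nrow(ts1)). Within each group, convert the 'start' and 'end' columns to date-time class (using dmy_hm from lubridate), build the sequence by '1 hour', and drop its last element with head(..., -1) so that the end time itself is excluded. Then format the result into the expected format, split it on whitespace (tstrsplit), combine it with the 'value' column, and remove the 'rn' column by assigning NULL. Finally, we can change the column names (if needed).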

library(lubridate) 
library(data.table) 
res <- setDT(ts1)[,{st <- dmy_hm(start) 
        et <- dmy_hm(end) 
        c(tstrsplit(format(head(seq(st, et, by = "1 hour"),-1), 
          "%d/%m/%Y %H:%M"), "\\s+"), as.character(value))} , 
     by = .(rn=1:nrow(ts1)) 
    ][, rn := NULL][] 
setnames(res, c("date", "hour", "value"))[] 
#   date hour  value 
# 1: 26/11/2014 13:00 decreasing 
# 2: 26/11/2014 14:00 decreasing 
# 3: 26/11/2014 15:00 decreasing 
# 4: 26/11/2014 16:00 decreasing 
# 5: 26/11/2014 17:00 decreasing 
# 6: 26/11/2014 18:00 decreasing 
# 7: 26/11/2014 19:00 decreasing 
# 8: 26/11/2014 20:00 increasing 
# 9: 26/11/2014 21:00 increasing 
#10: 26/11/2014 22:00 increasing 
#11: 26/11/2014 23:00 increasing 
#12: 27/11/2014 00:00 increasing 
#13: 27/11/2014 01:00 increasing 
#14: 27/11/2014 02:00 increasing 
#15: 27/11/2014 03:00 increasing 
#16: 27/11/2014 04:00 increasing 
#17: 27/11/2014 05:00 increasing 
#18: 27/11/2014 06:00 increasing 
#19: 27/11/2014 07:00 increasing 
#20: 27/11/2014 08:00 increasing 
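
As a side note on the attempt in the question: strftime(ts1$end, ...) most likely returns "00:00:00" because the column is still text in day/month/year order, so the time part is never parsed as a time. A minimal base R sketch of parsing with an explicit format first is shown here. It assumes the timestamps can be treated as UTC (the question does not state a time zone), and the names end_txt, end_dt and hours are only illustrative.

```r
# Assumption: timestamps are treated as UTC; the question gives no time zone.
end_txt <- c("26/11/2014 20:00", "27/11/2014 09:00")

# Parse with an explicit day/month/year format before extracting anything
end_dt <- as.POSIXct(end_txt, format = "%d/%m/%Y %H:%M", tz = "UTC")

# With a real date-time class, the hour part can be formatted directly
hours <- format(end_dt, "%H:%M")
print(hours)
```

dmy_hm from lubridate, used above, does the same parsing step without a hand-written format string.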
1

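Here is a solution using lubridate and plyr. It processes each row of the data to build a sequence from start to end and returns it together with the value. The per-row results are combined into a single data frame. If you need to process the result further, it is better not to split the date-time into separate date and time columns.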

library(plyr) 
library(lubridate) 
ts1$start <- dmy_hm(ts1$start) 
ts1$end <- dmy_hm(ts1$end) 

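adply(.data = ts1, .margins = 1, .fun = function(x) {
    # Hourly sequence from start up to, but not including, end
    datetime <- head(seq(x$start, x$end, by = "hour"), -1)
    # Alternative that keeps a single date-time column:
    # data.frame(datetime, value = x$value)
    data.frame(date = as.Date(datetime), time = format(datetime, "%H:%M"), value = x$value)
})[, -(1:2)]  # drop the original start and end columns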
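
Following the advice above about keeping the date-time in one column, the same expansion can be sketched in base R without plyr. This is only an illustration under stated assumptions: the timestamps are treated as UTC, the input is rebuilt inline so the block runs on its own, and the names fmt, st, et, pieces and res are made up for the example.

```r
# Example input mirroring the question; timestamps assumed to be UTC
ts1 <- data.frame(
  start = c("26/11/2014 13:00", "26/11/2014 20:00"),
  end   = c("26/11/2014 20:00", "27/11/2014 09:00"),
  value = c("decreasing", "increasing"),
  stringsAsFactors = FALSE
)

fmt <- "%d/%m/%Y %H:%M"
st <- as.POSIXct(ts1$start, format = fmt, tz = "UTC")
et <- as.POSIXct(ts1$end,   format = fmt, tz = "UTC")

# One data frame per input row: hourly steps from start up to,
# but not including, end (head(..., -1) drops the end time itself)
pieces <- lapply(seq_len(nrow(ts1)), function(i) {
  datetime <- head(seq(st[i], et[i], by = "hour"), -1)
  data.frame(datetime = datetime, value = rep(ts1$value[i], length(datetime)),
             stringsAsFactors = FALSE)
})

# Stack the per-row pieces into a single data frame
res <- do.call(rbind, pieces)
```

If separate columns are needed for display, they can be derived at the end with format(res$datetime, "%d/%m/%Y") and format(res$datetime, "%H:%M").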