2016-10-03 337 views
1

R: cut datetimes

I have a data_frame of POSIXct datetimes. I now want to create a variable that divides these datetimes into time bands: 1 - [00:00:00, 08:00:00), 2 - [08:00:00, 17:00:00), 3 - [17:00:00, 18:30:00], 4 - [18:30:00, 00:00:00].

Here is some sample data:

library(dplyr)

df_times = data_frame(
  datetime = seq.POSIXt(
    # parse the start time directly; strftime() formats rather than parses
    from = as.POSIXct("2016-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
    by = "min",
    length.out = 100000
  ),
  value = rnorm(100000)
)

Here is the expected output:

> df_times 
# A tibble: 100,000 × 3 
       datetime  value band 
       <dttm>  <dbl> <dbl> 
1 2016-01-01 00:00:00 0.5855288  1 
2 2016-01-01 00:01:00 0.7094660  1 
3 2016-01-01 00:02:00 -0.1093033  1 
4 2016-01-01 00:03:00 -0.4534972  1 
5 2016-01-01 00:04:00 0.6058875  1 
6 2016-01-01 00:05:00 -1.8179560  1 
7 2016-01-01 00:06:00 0.6300986  1 
8 2016-01-01 00:07:00 -0.2761841  1 
9 2016-01-01 00:08:00 -0.2841597  1 
10 2016-01-01 00:09:00 -0.9193220  1 
# ... with 99,990 more rows 

I have tried cut.POSIXt, but it insists on keeping track of the dates. An ideal solution would use dplyr::recode or forcats.

Answers

3

Here is my attempt at translating the intent of the problem directly into code:

library(dplyr)
library(chron)  # times() is assumed to come from chron; the answer does not name the package

set.seed(12345)

# create a dataset and assign each time of day to a band
df_times = data_frame(
  datetime = seq.POSIXt(
    from = as.POSIXct("2016-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
    by = "min",
    length.out = 100000
  ),
  value = rnorm(100000)
) %>%
  mutate(
    # time of day only, with the date dropped
    time = times(format(datetime, "%H:%M:%S")),
    band = cut(
      time,
      breaks = times(c(
        "00:00:00",
        "08:00:00",
        "17:00:00",
        "18:30:00",
        "23:59:59"
      )),
      labels = c("1", "2", "3", "4"),
      include.lowest = TRUE,
      right = FALSE
    )
  )
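
times() is not part of base R. It is presumably chron::times (an assumption, since the answer does not name the package), which stores a time of day as a fraction of a day. A small sketch under that assumption, showing why such values can be compared against numeric break points:

```r
library(chron)  # assumption: the answer's times() is chron::times

t <- times(c("00:00:00", "06:00:00", "18:30:00"))

# A times object is a fraction of a day under the hood,
# e.g. 06:00:00 is a quarter of a day
frac <- as.numeric(t)

# so a band boundary is just another fraction to compare against
after_five <- frac >= as.numeric(times("17:00:00"))
```

Because the date part is dropped before conversion, the same break points apply to every day in the data.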
+1

Where does the 'times' function come from? – mlevy

2

You can create an hour column and then cut on it:

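# hour of day as a decimal; as.numeric() counts seconds since the epoch in UTC,
# so this assumes the datetimes are in UTC
df_times$hour = as.numeric(df_times$datetime) %% (24*60*60) / 3600
df_times$band = cut(df_times$hour, breaks = c(0, 8, 17, 18.5, 24), include.lowest = TRUE,
                    right = FALSE)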
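
The modulo approach works on seconds since the epoch in UTC, so for datetimes in another time zone the hours would be shifted. Here is a minimal base-R sketch that reads the clock time from as.POSIXlt instead. The name time_band is a hypothetical helper, not something from this thread, and findInterval stands in for cut so the result is a plain integer band.

```r
# Hypothetical helper (not from the thread): map datetimes to bands 1-4
time_band <- function(x) {
  lt <- as.POSIXlt(x)  # keeps the time zone of x
  hour <- lt$hour + lt$min / 60 + lt$sec / 3600
  # findInterval() uses left-closed intervals [a, b), like cut(right = FALSE)
  findInterval(hour, c(0, 8, 17, 18.5))
}

x <- as.POSIXct(
  c("2016-01-01 00:00:00", "2016-01-01 07:59:00", "2016-01-01 08:00:00",
    "2016-01-01 17:00:00", "2016-01-01 18:30:00", "2016-01-01 23:59:00"),
  tz = "UTC"
)
bands <- time_band(x)
```

With a data frame like the one in the question, this could be used as df_times$band = time_band(df_times$datetime).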
+0

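Hi eipi10, thanks for this, but ideally I am looking for a more elegant/readable solution. I already have a similar solution, and it is cumbersome to work with. – tchakravarty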

+0

I have posted an answer that I think is more compact and expressive - comments welcome. – tchakravarty