R：將長格式轉換爲寬格式填寫缺失日期

我正在重塑我公司的小時註冊數據，以適應某種格式。我已將輸入修改爲如下所示：R：將長格式轉換爲寬格式填寫缺失日期

employee project month day hours 
1   A 16-001  9 9  5 
2   B 16-001  9 29  1 
3   A 16-001  9 3  5 
4   B 16-001  9 28  2 
5   A 16-002  9 8  6 
6   B 16-002  9 9  4 
7   A 16-002 10 25  6 
8   B 16-002 10 21  8 
9   A overig 10 6  6 
10  B overig 10 17  7 
11  A overig 10 9  1 
12  B overig 10 10  7 

#reproducicle data: 
df <- data.frame(employee = rep(c("A","B"),6),project=rep(c("16-001","16-002","overig"), each=4), month=rep(c(9,10),each=6),day=sample(1:30,12,replace=T), hours=sample(1:8,12,replace=T)) 

#Now, I need to move this to a cross table: 
res <- ftable(xtabs(hours~month+employee+project+day, aggregate(hours~month+employee+project+day, data=df, FUN=sum))) 

#And put this cross table in a data.frame (for export to csv) 
library(reshape2) 
df_res <- dcast(as.data.frame(res), as.formula(paste(paste(names(attr(res, "row.vars")), collapse="+"), "~", paste(names(attr(res, "col.vars")))))) 

df_res 

    month employee project 3 6 8 9 10 17 21 25 28 29 
1  9  A 16-001 5 0 0 5 0 0 0 0 0 0 
2  9  A 16-002 0 0 6 0 0 0 0 0 0 0 
3  9  A overig 0 0 0 0 0 0 0 0 0 0 
4  9  B 16-001 0 0 0 0 0 0 0 0 2 1 
5  9  B 16-002 0 0 0 4 0 0 0 0 0 0 
6  9  B overig 0 0 0 0 0 0 0 0 0 0 
7  10  A 16-001 0 0 0 0 0 0 0 0 0 0 
8  10  A 16-002 0 0 0 0 0 0 0 6 0 0 
9  10  A overig 0 6 0 1 0 0 0 0 0 0 
10 10  B 16-001 0 0 0 0 0 0 0 0 0 0 
11 10  B 16-002 0 0 0 0 0 0 8 0 0 0 
12 10  B overig 0 0 0 0 7 7 0 0 0 0

我不確定這是最好的方式，但現在格式不錯。然而，我需要把所有的德日作爲列，而不僅僅是我的data.frame中的日子（所以31列，最好是不存在的日期（例如31），其餘爲0。建議如何獲取？

來源

2016-11-23 RHA

我覺得這是一個可以接受的解決方案，它會處理閏年太（加分）。不過趁着tidyr::spread()真好因素填充行爲與drop = F，但現在使用功能lubridate::days_in_month() 。只，但流傳至今這裏，我們去：

library(tidyr) 
library(lubridate) 
library(purrr) 

df$year <- 2016 
df$num_in_month <- ymd(paste(df$year, df$month, df$day)) %>% 
    days_in_month() 

df %>% split(.$month) %>% 
    map(~mutate(., day = factor(day, levels = 1:unique(num_in_month)))) %>% 
    map(~spread(., key = day, value = hours, fill = 0, drop = F)) %>% 
    bind_rows() %>% 
    select(-num_in_month) 

    employee project month year 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
1  A 16-001  9 2016 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 8 0 0 NA 
2  A 16-002  9 2016 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NA 
3  B 16-001  9 2016 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 NA 
4  B 16-002  9 2016 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NA 
5  A 16-002 10 2016 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
6  A overig 10 2016 0 4 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
7  B 16-002 10 2016 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
8  B overig 10 2016 0 0 0 0 6 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

乾杯

來源

2016-11-23 13:41:45 Nate

我不知道（+1），但它ð oes沒有完全回答這個問題。首先'spread'會拋出一個錯誤「行重複標識符」，這實際上可能存在於數據中。其次，所有的日期都充滿了NA，既有存在的日期（如sep-1），也有日期（sep-31）。 – RHA

啊，我誤解了你填寫新生的標準。 – Nate

你是否打算讓這種行爲能夠認識閏年？ – Nate

R：將長格式轉換爲寬格式填寫缺失日期

回答

相關問題