2016-10-03
0

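Adding incremental dates to a data frame based on a condition in R

Hi, I have a data frame with the following structure. Week_Start_Date is the start date of the week (weeks start on Sunday).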

DF1: 
Week_Start_Date  Event    Days 
2016-08-14   Independence  4 
2016-01-24   Republic   3 

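I want to expand DF1 so that each row is repeated with the date increased by one day at a time, up to the number in the Days column. For example, the Independence celebration lasts 4 days, from 2016-08-14 (Week_Start_Date) to 2016-08-17.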

DF2: 
Week_Start_Date  Event    Days 
2016-08-14   Independence  1 
2016-08-15   Independence  2 
2016-08-16   Independence  3 
2016-08-17   Independence  4 
2016-01-24   Republic   1 
2016-01-25   Republic   2 
2016-01-26   Republic   3 

I am using the 'dplyr' package and tried the following, without success:

DF2 <- rbind(DF1, DF1 %>% 
mutate(Week_Start_Date = Week_Start_Date + 1:Days, Event=Event, Days = 1:Days)) 

Can anyone point me in the right direction?

Note:

str(DF1$Week_Start_Date): Date, format: "2016-08-04" 

Answers

2

A solution in base R:

# Sample data
DF1 <- cbind.data.frame(
    Week_Start_Date = c(as.Date("2016-08-14"), as.Date("2016-01-24")),
    Event = c("Independence", "Republic"),
    Days = c(4, 3),
    stringsAsFactors = FALSE)

# Apply per row, create a list of data frames and rbind the entries.
# apply() coerces each row to character, hence as.Date() and as.numeric().
lst <- apply(DF1, 1, function(x)
    cbind.data.frame(
        Week_Start_Date = as.Date(x["Week_Start_Date"]) + seq(0, as.numeric(x["Days"]) - 1),
        Event = x["Event"],
        Days = seq(1, as.numeric(x["Days"])),
        row.names = NULL))
df <- do.call(rbind, lst)

# Output
print(df)
    Week_Start_Date  Event Days 
1  2016-08-14 Independence 1 
2  2016-08-15 Independence 2 
3  2016-08-16 Independence 3 
4  2016-08-17 Independence 4 
5  2016-01-24  Republic 1 
6  2016-01-25  Republic 2 
7  2016-01-26  Republic 3 
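
The same expansion can also be sketched without apply(), by repeating row indices with rep() and numbering the days with sequence(). This is only an illustrative variant of the approach above, not part of the original answer; the names idx and day_no are made up for the example.

```r
# Sample data, same as above
DF1 <- data.frame(
    Week_Start_Date = as.Date(c("2016-08-14", "2016-01-24")),
    Event = c("Independence", "Republic"),
    Days = c(4, 3),
    stringsAsFactors = FALSE)

# Repeat each row index Days times: 1 1 1 1 2 2 2
idx <- rep(seq_len(nrow(DF1)), DF1$Days)

# Day counter restarting at 1 for each original row: 1 2 3 4 1 2 3
day_no <- sequence(DF1$Days)

# Shift each start date by (day number - 1) days
DF2 <- data.frame(
    Week_Start_Date = DF1$Week_Start_Date[idx] + day_no - 1,
    Event = DF1$Event[idx],
    Days = day_no,
    stringsAsFactors = FALSE)

print(DF2)
```

Because the Date column is never coerced to character here, no as.Date() or as.numeric() conversions are needed.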
1

If your Event column does not contain duplicate values, you can use the dplyr and tidyr packages:

library(dplyr) 
library(tidyr) 
df %>% 
     group_by(Event, Week_Start_Date) %>% 
     complete(Days = sequence(Days)) %>% 
     ungroup() %>% 
     mutate(Week_Start_Date = Week_Start_Date + Days - 1) 

# A tibble: 7 x 3 
#   Event Week_Start_Date Days 
#   <chr>   <date> <int> 
#1 Independence  2016-08-14  1 
#2 Independence  2016-08-15  2 
#3 Independence  2016-08-16  3 
#4 Independence  2016-08-17  4 
#5  Republic  2016-01-24  1 
#6  Republic  2016-01-25  2 
#7  Republic  2016-01-26  3 

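More generally, if the Event column contains duplicate values, you can create a row number to use as the grouping variable, which can be done with the tibble::rownames_to_column() function.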
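
A minimal sketch of that variant, assuming a hypothetical data frame in which the same event appears twice; the column name row_id and the sample data are made up for illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical data: the same Event occurs on two different weeks
df <- data.frame(
    Week_Start_Date = as.Date(c("2016-08-14", "2016-08-21")),
    Event = c("Festival", "Festival"),
    Days = c(2L, 3L),
    stringsAsFactors = FALSE)

res <- df %>%
    rownames_to_column("row_id") %>%                 # one id per original row
    group_by(row_id, Event, Week_Start_Date) %>%     # row_id keeps duplicate events apart
    complete(Days = sequence(Days)) %>%              # expand Days to 1..Days within each row
    ungroup() %>%
    mutate(Week_Start_Date = Week_Start_Date + Days - 1) %>%
    select(-row_id)

print(res)
```

Grouping by row_id means each original row is expanded on its own, so two rows with the same Event do not interfere with each other.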

1

Here is an option using data.table, after expanding the rows based on the values in 'Days':

library(data.table) 
setDT(df1[rep(seq_len(nrow(df1)), df1$Days),])[, 
    .(Week_Start_Date = Week_Start_Date + seq(.N)-1, Days = seq_len(.N)) , by = Event] 
#   Event Week_Start_Date Days 
#1: Independence  2016-08-14 1 
#2: Independence  2016-08-15 2 
#3: Independence  2016-08-16 3 
#4: Independence  2016-08-17 4 
#5:  Republic  2016-01-24 1 
#6:  Republic  2016-01-25 2 
#7:  Republic  2016-01-26 3 