2016-03-09

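Optimizing a business day check (R)

My goal is to take each startDate observation from a data frame and compare it against a calendar. If the date is not a business day (a holiday or a weekend), it gets pushed back until it is a valid business day. I would do the same for endDate, but push it forward instead.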

As an example, this is what my data looks like:

tickers startDate endDate 
1 GOOGL 2016-01-31 2016-02-02 
2 GOOGL 2015-10-21 2015-10-23 
3 GOOGL 2015-07-15 2015-07-17 
4 GOOGL 2015-04-22 2015-04-24 
5 GOOGL 2015-01-28 2015-01-30 
6 GOOGL 2014-10-15 2014-10-17 

My calendar information:

 Date Weekday Business   Event 
1 2001-01-01 Monday FALSE New Years Day 
2 2001-01-02 Tuesday  TRUE   <NA> 
3 2001-01-03 Wednesday  TRUE   <NA> 
4 2001-01-04 Thursday  TRUE   <NA> 
5 2001-01-05 Friday  TRUE   <NA> 
6 2001-01-06 Saturday FALSE   <NA> 

I achieved this the following way, with dplyr:

for(i in 1:10){ 
stocks1 <- stocks1 %>% 
    mutate(startDate = as.Date(ifelse(startDate %in% dates[dates$Business==F,]$Date, startDate - 1, startDate))) %>% 
    mutate(endDate = as.Date(ifelse(endDate %in% dates[dates$Business==F,]$Date, endDate + 1, endDate))) 
} 

I think there must be a more elegant way to do this... any ideas? Ideally with dplyr, since I am trying to master that package :)

Thanks!

See http://stackoverflow.com/questions/20709654/add-1-business-day-to-date-in-r

Answer

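The chron package has two handy functions, is.weekend and is.holiday, that help a lot here. As for optimization, this looks like one of the rare cases in R where a while loop is actually worthwhile. You will still need two of them, one per date column, unless you want to generate them programmatically.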

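One caveat: is.holiday needs a list of holidays (by default it uses six US holidays from 1992). We can use the dates from the second data.frame where Business == FALSE. That set also includes weekends, but that does no harm. In fact, if the weekend rows in your calendar are reliable, you could skip is.weekend entirely with this approach. In the sample data the two date ranges do not overlap (the calendar rows are from 2001, the stock rows from 2014 to 2016), so the calendar contributes nothing useful here. With the right data, though, the approach works. So, where df1 and df2 are your first and second data.frames, respectively: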

library(chron) 
# non-business dates from the calendar, in chron's dates form for is.holiday 
holidays <- chron(dates. = as.character(df2$Date[!df2$Business]), format = 'y-m-d') 
# push start dates back one day at a time until none is a weekend or holiday 
while(any(is.weekend(df1$startDate) | is.holiday(df1$startDate, holidays))){ 
    indices <- is.weekend(df1$startDate) | is.holiday(df1$startDate, holidays) 
    df1$startDate[indices] <- df1$startDate[indices] - 1 
} 
# push end dates forward the same way 
while(any(is.weekend(df1$endDate) | is.holiday(df1$endDate, holidays))){ 
    indices <- is.weekend(df1$endDate) | is.holiday(df1$endDate, holidays) 
    df1$endDate[indices] <- df1$endDate[indices] + 1 
} 
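
If the calendar spans the whole date range of the data, the loops could be avoided altogether by looking up the nearest business day in a sorted vector. What follows is only a sketch of that idea using base R's findInterval; it is a different technique from the chron loops. The names snap_back, snap_forward and business_days are made up for illustration, and the calendar is generated on the spot (with one invented holiday) rather than taken from df2.

```r
# Illustrative calendar (an assumption): weekdays in Jan-Feb 2016,
# minus one made-up holiday
cal <- seq(as.Date("2016-01-01"), as.Date("2016-02-29"), by = "day")
is_business <- !(as.POSIXlt(cal)$wday %in% c(0L, 6L)) & cal != as.Date("2016-01-18")
business_days <- cal[is_business]  # must be sorted ascending

# Latest business day on or before each date; NA if the calendar has none
snap_back <- function(x, business_days) {
  idx <- findInterval(as.numeric(x), as.numeric(business_days))
  idx[idx == 0L] <- NA
  business_days[idx]
}

# Earliest business day on or after each date; NA if the calendar has none
snap_forward <- function(x, business_days) {
  idx <- findInterval(as.numeric(x), as.numeric(business_days),
                      left.open = TRUE) + 1L
  business_days[idx]
}

x <- as.Date(c("2016-01-31", "2016-01-18", "2016-01-20"))
print(snap_back(x, business_days))
print(snap_forward(x, business_days))
```

Both helpers are vectorised, so each column is handled in a single call. The trade-off is that dates outside the calendar come back as NA instead of being rolled by a loop.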

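Alternatively, you could write your own function, with or without chron. Either way, base R has a weekdays function that does half the work. Once the logic is wrapped in a function, you can call it inside dplyr's mutate/transmute, so it slots neatly into a chain.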
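
A minimal sketch of such a function, using base R only. The name roll_to_business_day and its arguments are hypothetical, and the sample dates and the single holiday are invented for illustration. It tests for weekends with as.POSIXlt()$wday rather than weekdays(), because the day names returned by weekdays() depend on the locale.

```r
# Hypothetical helper (not from chron or dplyr): move each date in `x` by
# `step` days (-1 = earlier, +1 = later) until it is a business day.
# `non_business` is assumed to be a Date vector of holidays.
roll_to_business_day <- function(x, non_business, step = -1L) {
  # wday is 0 for Sunday and 6 for Saturday, independent of locale
  is_off <- function(d) as.POSIXlt(d)$wday %in% c(0L, 6L) | d %in% non_business
  off <- is_off(x)
  while (any(off)) {
    x[off] <- x[off] + step  # only the offending dates are moved
    off <- is_off(x)
  }
  x
}

# Made-up inputs for illustration only
non_business <- as.Date("2016-01-18")
stocks <- data.frame(
  startDate = as.Date(c("2016-01-31", "2016-01-18", "2016-01-20")),
  endDate   = as.Date(c("2016-01-31", "2016-01-18", "2016-01-20"))
)

stocks$startDate <- roll_to_business_day(stocks$startDate, non_business, step = -1L)
stocks$endDate   <- roll_to_business_day(stocks$endDate, non_business, step = 1L)
print(stocks)
```

With dplyr loaded, the same calls should fit inside a chain, for example mutate(startDate = roll_to_business_day(startDate, non_business, -1L), endDate = roll_to_business_day(endDate, non_business, 1L)).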


Data:

df1 <- structure(list(tickers = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "GOOGL", class = "factor"), 
    startDate = structure(c(16831, 16729, 16631, 16547, 16463, 
    16358), class = "Date"), endDate = structure(c(16833, 16731, 
    16633, 16549, 16465, 16360), class = "Date")), .Names = c("tickers", 
    "startDate", "endDate"), row.names = c(NA, -6L), class = "data.frame") 

df2 <- structure(list(Date = structure(c(11323, 11324, 11325, 11326, 
    11327, 11328), class = "Date"), Weekday = structure(c(2L, 5L, 
    6L, 4L, 1L, 3L), .Label = c("Friday", "Monday", "Saturday", "Thursday", 
    "Tuesday", "Wednesday"), class = "factor"), Business = c(FALSE, 
    TRUE, TRUE, TRUE, TRUE, FALSE), Event = structure(c(2L, 1L, 1L, 
    1L, 1L, 1L), .Label = c("<NA>", "New_Years_Day"), class = "factor")), .Names = c("Date", 
    "Weekday", "Business", "Event"), row.names = c(NA, -6L), class = "data.frame")