2016-12-13

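If statement error / if statement not being applied

I am trying to write a piece of code that works like rollapply in zoo/xts, but adapted to my needs. I built the code using some very simple sample data and everything worked fine. Now that I am trying to run it on the edhec data, I get an error. I am not sure why, but I think it has to do with the if statement. Can anyone diagnose why I am getting the error?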

#rm(list=ls()) #Clear environment 
cat("\014") #CTRL + L 

library(xts) 
library(lubridate) 

is.even <- function(x) x %% 2 == 0 

roundUp <- function(x,to=2) 
{ 
    to*(x%/%to + as.logical(x%%to)) 
} 

functionTest <- function(data, window, slide){ 

    nyears_t = nyears(data) 

    #IF statement for non-even numbers only 
    if(is.even(nyears_t == FALSE)) { 
    nyears_t <- roundUp(nyears_t) 
    data_extend <- data 

    start_extend <- .indexyear(data)[length(data)]+ 1900 + 1 
    end_extend <- start_extend + length(data) - 1 
    index(data_extend) <- update(index(data),year=start_extend:end_extend) 

    data <- rbind(data, data_extend) 

    warning("WARNING! The function has looped to the start of the timeseries. The final list(s) 
      will contain years that do not exist in the dataset. Please modify.") 
    } 

    nslides = nyears_t/slide 

    #Matrix 
    year_1 = (.indexyear(data)[1]+1900) 

    start <- seq(from = year_1, by = slide, length.out = nslides) 
    end <- start + window - 1 

    mat <- matrix(c(start, end), ncol = 2, dimnames = list(c(1:nslides), c("start", "end"))) 

    #For loop 
    subsetlist <- vector('list') 

    for(i in 1:nslides){ 
    subset <- data[paste0(mat[i,1], "/", mat[i,2])] 
    subsetlist[[i]] <- subset 
    } 
    print(subsetlist) 
} 
This is what I was running the function above on.

Sample code:

a <- seq(from = as.POSIXct("2000", format = "%Y"), to = as.POSIXct("2008", format = "%Y"), by = "year") 
a <- as.xts(1:length(a), order.by = a) 
a 

functionTest(data = a, window = 3, slide = 2) 

Example code I am testing on, which produces an error:

> data(edhec, package = "PerformanceAnalytics") 
> edhec <- edhec[,1:3] 
> edhec <- edhec["/2007"] 
> head(edhec) 
           Convertible Arbitrage CTA Global Distressed Securities
1997-01-31                0.0119     0.0393                0.0178
1997-02-28                0.0123     0.0298                0.0122
1997-03-31                0.0078    -0.0021               -0.0012
1997-04-30                0.0086    -0.0170                0.0030
1997-05-31                0.0156    -0.0015                0.0233
1997-06-30                0.0212     0.0085                0.0217
> functionTest(data = edhec, window = 3, slide = 2) 
Error in start_extend:end_extend : NA/NaN argument 
> 

UPDATE:

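The code now runs after the following update to the if statement (thanks to Joshua Ulrich; see the code below). However, there is still a problem with the if statement: it runs regardless of whether the dataset contains an even or odd number of years. This does not affect the accuracy of the function, but it could become a problem with large datasets. If anyone has any ideas on this, they would be much appreciated. Otherwise, this has been super! Cheers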

if(is.even(nyears_t == FALSE)) { 
    nyears_t <- roundUp(nyears_t) 
    data_extend <- data 

    start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1 
    end_extend <- start_extend + nyears(data) - 1 

    dates <- index(data) 
    tmp <- as.POSIXlt(dates) 
    tmp$year <- tmp$year + nyears(data) 
    dates2 <- as.POSIXct(tmp, tz = tz) 
    index(data_extend) <- dates2 

    data <- rbind(data, data_extend) 

    warning("WARNING! The function has looped to the start of the timeseries. The final list(s) 
      will contain years that do not exist in the dataset. Please modify.") 
    } 
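
As a rough illustration of the year shift used in the update, the base-R sketch below applies the same POSIXlt trick to two made-up dates. The names `dates`, `n_shift` and `shifted` are illustrative only and are not part of the original code; `as.Date` is used for the conversion back (rather than `as.POSIXct`) simply to keep time zones out of the printed result.

```r
# Two illustrative month-end dates (assumed sample values, not the edhec index)
dates <- as.Date(c("1997-01-31", "1997-02-28"))
n_shift <- 11  # hypothetical number of years to shift by

# POSIXlt stores the year as "years since 1900", so adding to $year moves the date
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + n_shift
shifted <- as.Date(tmp)

print(shifted)
```

This avoids building a sequence of years with `:`, so it does not depend on how many rows or columns the data has.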

Answers

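Calling length on a matrix (which is what the coredata of an xts/zoo object is) gives you the total number of elements (i.e. the length of the underlying vector). You should use nrow instead: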

start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1 
end_extend <- start_extend + nrow(data) - 1 

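If you don't know whether data will be a matrix or a vector, then you should use NROW rather than nrow. Calling nrow on a vector returns NULL, whereas NROW returns length(x) if x is a vector.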
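
A small base-R sketch of the difference, using a made-up matrix `m`, vector `v` and year vector `years` (all illustrative names, not taken from the question). It also shows, under those assumptions, how indexing with length() on multi-column data can produce the NA that the `:` operator then rejects:

```r
m <- matrix(1:6, nrow = 3, ncol = 2)  # 3 rows, 2 columns
v <- 1:3                              # plain vector, no dim attribute

length(m)  # total number of elements (rows * columns)
nrow(m)    # number of rows
nrow(v)    # NULL, because a vector has no dimensions
NROW(v)    # treats the vector as a one-column matrix

# Indexing a 3-element vector at position length(m) runs past its end
years <- c(1997L, 1998L, 1999L)
last_year <- years[length(m)]   # NA
right_year <- years[nrow(m)]    # 1999

# Using NA as a bound of ":" raises an error, which is captured here as text
msg <- tryCatch(last_year:last_year, error = function(e) conditionMessage(e))
print(msg)
```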

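Thanks @Joshua, I found this error while combing through the function. I also noticed that I needed to make some further changes for the code to run correctly (I have added them to the question). The code now seems to run as intended. However, there is still a small hiccup. It looks like the if statement now runs every time... I changed edhec to have odd and even numbers of years, and the if statement is always applied. This suggests there is still a fundamental error in the if statement. – Visser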

@Visser: Regarding your update, I think you need 'if(!is.even(nyears_t))'. –
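
To see why the condition in the question is always true, here is a minimal sketch using the question's is.even helper. The values `nyears_odd` and `nyears_even` are made up for illustration:

```r
is.even <- function(x) x %% 2 == 0

nyears_odd <- 11
nyears_even <- 12

# Misplaced parenthesis: the comparison runs first and yields FALSE for any
# non-zero count; FALSE is coerced to 0, and 0 is even, so this is always TRUE
wrong_odd <- is.even(nyears_odd == FALSE)
wrong_even <- is.even(nyears_even == FALSE)

# Negating the result of is.even() gives the intended test
right_odd <- !is.even(nyears_odd)    # TRUE: odd, so the series is extended
right_even <- !is.even(nyears_even)  # FALSE: even, so the extension is skipped

print(c(wrong_odd, wrong_even, right_odd, right_even))
```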


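I have worked out a complete answer that has the desired effect. Thanks to @Joshua for the help; I don't think I could have fixed the code without it. To run it on large data, I had to make some additional changes.

For the sake of interest, here is my full working code (minus my other custom functions):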

bootOffset <- function(data, window, slide, tz = "GMT"){

    nyears_t <- nyears(data)

    # Extend the series only when the number of years is odd
    if(!is.even(nyears_t)) {
        nyears_t <- roundUp(nyears_t)
        data_extend <- data

        start_extend <- .indexyear(data)[nrow(data)] + 1900 + 1
        end_extend <- start_extend + nyears(data) - 1

        # Shift a copy of the index forward by the number of years in the data
        dates <- index(data)
        tmp <- as.POSIXlt(dates)
        tmp$year <- tmp$year + nyears(data)
        dates2 <- as.POSIXct(tmp, tz = tz)

        index(data_extend) <- dates2
        data <- rbind(data, data_extend)
    }

    nslides <- nyears_t/slide

    year_1 <- .indexyear(data)[1] + 1900

    # Matrix of start and end years, one row per slide
    start <- seq(from = year_1, by = slide, length.out = nslides)
    end <- start + window - 1
    mat <- matrix(c(start, end), ncol = 2, dimnames = list(c(1:nslides), c("start", "end")))

    # Subset the data for each start/end pair
    subsetlist <- vector('list')

    for(i in 1:nslides){
        subset <- window(data,
            start = as.POSIXct(paste0(mat[i,1], "-01-01")),
            end = as.POSIXct(paste0(mat[i,2], "-12-31")))

        subsetlist[[i]] <- subset
    }
    print(subsetlist)
}

And to confirm, the results come out as expected:

data(edhec, package = "PerformanceAnalytics") 
edhec <- edhec[,1:3] 
edhec08 <- edhec["/2008"] 
edhec07 <- edhec["/2007"] 

boot08 <- bootOffset(data = edhec08, #EVEN
        window = 4,
        slide = 3)

boot07 <- bootOffset(data = edhec07, #ODD
        window = 4,
        slide = 3)
> bootOffset.Check <- function(boot){ 
+ dates <- lapply(boot, year) 
+ dates <- lapply(dates, unique) 
+ dates <- lapply(dates, `length<-`, max(lengths(dates))) 
+ as.data.frame(dates, 
+ col.names = paste0("boot_", 1:length(boot))) 
+ 
+ } 
> 
> nyears(edhec08) 
[1] 12 
> bootOffset.Check(boot08) #EVEN number of years 
  boot_1 boot_2 boot_3 boot_4
1   1997   2000   2003   2006
2   1998   2001   2004   2007
3   1999   2002   2005   2008
4   2000   2003   2006     NA
> 
> nyears(edhec07) 
[1] 11 
> bootOffset.Check(boot07) #ODD number of years 
  boot_1 boot_2 boot_3 boot_4
1   1997   2000   2003   2006
2   1998   2001   2004   2007
3   1999   2002   2005   2008
4   2000   2003   2006   2009
>
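
The start and end years in the check output follow from the seq() arithmetic inside bootOffset. As a rough sketch, the same calculation can be run on its own in base R, with the inputs hard-coded to the values from the example above (first year 1997, 12 years after rounding, window = 4, slide = 3):

```r
year_1 <- 1997   # first year of the series
nyears_t <- 12   # number of years, after rounding up to an even count
window <- 4      # width of each subset in years
slide <- 3       # offset between consecutive subsets in years

nslides <- nyears_t / slide
start <- seq(from = year_1, by = slide, length.out = nslides)
end <- start + window - 1

# One row per slide, mirroring the "mat" matrix built in bootOffset
mat <- matrix(c(start, end), ncol = 2,
              dimnames = list(as.character(seq_len(nslides)), c("start", "end")))
print(mat)
```

Each row gives the first and last calendar year of one subset. The last window ends in 2009, which is why the final list element only reaches past 2008 when the series has been extended.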