加速WMA（加權移動平均線）的計算

我試圖計算15天線條上的指數移動平均線，但希望看到15日線條EMA在每個（尾）日/線條上的「演變」。所以，這意味着我有15天的酒吧。當每天都有新的數據出現時，我想使用新的信息重新計算EMA。其實我有15天的酒吧，然後，每天過後，我新的15天酒吧開始增長，每個新酒吧應該用於EMA計算以及以前的整整15天的酒吧。加速WMA（加權移動平均線）的計算

假設我們從2012-01-01開始（本例中爲每個日曆日的數據），在2012-01-15年底，我們有第一個完整的15天吧。我們可以在2012-03-01完成4個完整的15天的條形圖後，開始計算4 bar EMA（EMA（x，n = 4））。在2012年3月2日結束時，我們將使用我們目前爲止的信息，並在2012-03-02計算EMA，假設2012-03-02的OHLC是15天的酒吧進行中。所以我們在2012-03-02拿4個完整的酒吧和酒吧並且計算EMA（x，n = 4）。然後，我們再等一天，看看正在進行的新的15天工作條發生了什麼（有關詳細信息，請參閱下面的函數to.period.cumulative），並計算EMA的新值......並且在接下來的15天之後... ...請參閱功能EMA.cumulative以下詳細...

下面請找到我能夠想出到現在。這種表現對我來說是不可接受的，而且我對有限的R知識無法做得更快。

library(quantmod) 

do.call.rbind <- function(lst) { 
    while(length(lst) > 1) { 
     idxlst <- seq(from=1, to=length(lst), by=2) 

     lst <- lapply(idxlst, function(i) { 
        if(i==length(lst)) { return(lst[[i]]) } 

        return(rbind(lst[[i]], lst[[i+1]])) 
       }) 
    } 
    lst[[1]] 
} 

to.period.cumulative <- function(x, name=NULL, period="days", numPeriods=15) { 
    if(is.null(name)) 
     name <- deparse(substitute(x)) 

    cnames <- c("Open", "High", "Low", "Close") 
    if (has.Vo(x)) 
     cnames <- c(cnames, "Volume") 

    cnames <- paste(name, cnames, sep=".") 

    if (quantmod:::is.OHLCV(x)) { 
     x <- OHLCV(x) 
     out <- do.call.rbind( 
       lapply(split(x, f=period, k=numPeriods), 
         function(x) cbind(rep(first(x[,1]), NROW(x[,1])), 
           cummax(x[,2]), cummin(x[,3]), x[,4], cumsum(x[,5])))) 
    } else if (quantmod:::is.OHLC(x)) { 
     x <- OHLC(x) 
     out <- do.call.rbind( 
       lapply(split(x, f=period, k=numPeriods), 
         function(x) cbind(rep(first(x[,1]), NROW(x[,1])), 
           cummax(x[,2]), cummin(x[,3]), x[,4]))) 
    } else { 
     stop("Object does not have OHLC(V).") 
    } 

    colnames(out) <- cnames 

    return(out) 
} 

EMA.cumulative<-function(cumulativeBars, nEMA = 4, period="days", numPeriods=15) { 
    barsEndptCl <- Cl(cumulativeBars[endpoints(cumulativeBars, on=period,  k=numPeriods)]) 

    # TODO: This is sloooooooooooooooooow... 
    outEMA <- do.call.rbind(
      lapply(split(Cl(cumulativeBars), period), 
        function(x) { 
         previousFullBars <- barsEndptCl[index(barsEndptCl) < last(index(x)), ] 
         if (NROW(previousFullBars) >= (nEMA - 1)) { 
           last(EMA(last(rbind(previousFullBars, x), n=(nEMA + 1)), n=nEMA)) 
         } else { 
          xts(NA, order.by=index(x)) 
         } 
        })) 

    colnames(outEMA) <- paste("EMA", nEMA, sep="") 

    return(outEMA) 
} 

getSymbols("SPY", from="2010-01-01") 

SPY.cumulative <- to.period.cumulative(SPY, , name="SPY") 

system.time(
     SPY.EMA <- EMA.cumulative(SPY.cumulative) 
)

在我的系統需要

user system elapsed 
    4.708 0.000 4.410

可接受的執行時間將超過1秒，少...是否有可能實現這一目標使用純的R？

這篇文章鏈接到Optimize moving averages calculation - is it possible?我沒有收到答案。我現在能夠創建一個可重現的例子，並更詳細地解釋我想加速的內容。我希望這個問題現在更有意義。

任何想法如何加快這一點高度讚賞。

來源

2012-01-03 Samo

嗯，那麼我有一個問題。可以說我們每天都有數據。我想以15天爲單位/小節計算4小時EMA（EMA（x，n = 4））。因此，使用to.period將日常數據轉換爲15天的酒吧。那很簡單。我想得到的是我想每天在15天酒吧看到4天EMA的發展。就像您希望隨着新數據持續進入EMA的實時圖形（附近）一樣。您將最後一次已知數據視爲完整的15天條（即使它僅僅是3天「舊」）。然後你把你現在知道的和以前所有完整的15天酒吧和EMA計算出來。好點？ – Samo 2012-01-03 23:51:29

約書亞，謝謝你的好意。只是爲了讓你意識到邊界和開始條件：我是一個兼職的無利可圖的零售交易者/程序員與一個小交易賬戶，使之成爲一個業餘愛好（或編程練習）誰選擇R作爲支持我的交易平臺（好吧，只有backtesting實際上）活動。我沒有爲任何法律實體的商業目的而開發此項服務。我非常感謝您在空閒時間提供的所有支持和所有支持。如果我沒有「免費」的其他想法，那麼我肯定會接受你的好意。 – Samo 2012-01-04 00:10:35

約書亞，這個沒有收入，對不起。感謝您推動我學習如何在R中使用C.感謝TTR中的C和Fortran代碼。 – Samo 2012-01-15 21:37:53

我還沒有找到一個令人滿意的解決方案，我的問題使用R.所以我採用了舊的工具，c語言，結果比我預想的要好。感謝您使用Rcpp的這些優秀工具「推」我，內聯等。令人驚歎。我想，只要我將來有性能需求，並且無法使用R來滿足，我會將C添加到R，並且性能在那裏。所以，請看下面我的代碼和解決性能問題。

# How to speedup cumulative EMA calculation 
# 
############################################################################### 

library(quantmod) 
library(Rcpp) 
library(inline) 
library(rbenchmark) 

do.call.rbind <- function(lst) { 
    while(length(lst) > 1) { 
     idxlst <- seq(from=1, to=length(lst), by=2) 

     lst <- lapply(idxlst, function(i) { 
        if(i==length(lst)) { return(lst[[i]]) } 

        return(rbind(lst[[i]], lst[[i+1]])) 
       }) 
    } 
    lst[[1]] 
} 

to.period.cumulative <- function(x, name=NULL, period="days", numPeriods=15) { 
    if(is.null(name)) 
     name <- deparse(substitute(x)) 

    cnames <- c("Open", "High", "Low", "Close") 
    if (has.Vo(x)) 
     cnames <- c(cnames, "Volume") 

    cnames <- paste(name, cnames, sep=".") 

    if (quantmod:::is.OHLCV(x)) { 
     x <- quantmod:::OHLCV(x) 
     out <- do.call.rbind( 
       lapply(split(x, f=period, k=numPeriods), 
         function(x) cbind(rep(first(x[,1]), NROW(x[,1])), 
           cummax(x[,2]), cummin(x[,3]), x[,4], cumsum(x[,5])))) 
    } else if (quantmod:::is.OHLC(x)) { 
     x <- OHLC(x) 
     out <- do.call.rbind( 
       lapply(split(x, f=period, k=numPeriods), 
         function(x) cbind(rep(first(x[,1]), NROW(x[,1])), 
           cummax(x[,2]), cummin(x[,3]), x[,4]))) 
    } else { 
     stop("Object does not have OHLC(V).") 
    } 

    colnames(out) <- cnames 

    return(out) 
} 

EMA.cumulative<-function(cumulativeBars, nEMA = 4, period="days", numPeriods=15) { 
    barsEndptCl <- Cl(cumulativeBars[endpoints(cumulativeBars, on=period, k=numPeriods)]) 

    # TODO: This is sloooooooooooooooooow... 
    outEMA <- do.call.rbind(
      lapply(split(Cl(cumulativeBars), period), 
        function(x) { 
         previousFullBars <- barsEndptCl[index(barsEndptCl) < last(index(x)), ] 
         if (NROW(previousFullBars) >= (nEMA - 1)) { 
           last(EMA(last(rbind(previousFullBars, x), n=(nEMA + 1)), n=nEMA)) 
         } else { 
          xts(NA, order.by=index(x)) 
         } 
        })) 

    colnames(outEMA) <- paste("EMA", nEMA, sep="") 

    return(outEMA) 
} 

EMA.c.c.code <- ' 
    /* Initalize loop and PROTECT counters */ 
    int i, P=0; 

    /* ensure that cumbars and fullbarsrep is double */ 
    if(TYPEOF(cumbars) != REALSXP) { 
     PROTECT(cumbars = coerceVector(cumbars, REALSXP)); P++; 
    } 

    /* Pointers to function arguments */ 
    double *d_cumbars = REAL(cumbars); 
    int i_nper = asInteger(nperiod); 
    int i_n = asInteger(n); 
    double d_ratio = asReal(ratio); 

    /* Input object length */ 
    int nr = nrows(cumbars); 

    /* Initalize result R object */ 
    SEXP result; 
    PROTECT(result = allocVector(REALSXP,nr)); P++; 
    double *d_result = REAL(result); 

    /* Find first non-NA input value */ 
    int beg = i_n*i_nper - 1; 
    d_result[beg] = 0; 
    for(i = 0; i <= beg; i++) { 
     /* Account for leading NAs in input */ 
     if(ISNA(d_cumbars[i])) { 
      d_result[i] = NA_REAL; 
      beg++; 
      d_result[beg] = 0; 
      continue; 
     } 
     /* Set leading NAs in output */ 
     if(i < beg) { 
      d_result[i] = NA_REAL; 
     } 
     /* Raw mean to start EMA - but only on full bars*/ 
     if ((i != 0) && (i%i_nper == (i_nper - 1))) { 
      d_result[beg] += d_cumbars[i]/i_n; 
     } 
    } 

    /* Loop over non-NA input values */ 
    int i_lookback = 0; 
    for(i = beg+1; i < nr; i++) { 
     i_lookback = i%i_nper; 

     if (i_lookback == 0) { 
      i_lookback = 1; 
     } 
     /*Previous result should be based only on full bars*/ 
     d_result[i] = d_cumbars[i] * d_ratio + d_result[i-i_lookback] * (1-d_ratio); 
    } 

    /* UNPROTECT R objects and return result */ 
    UNPROTECT(P); 
    return(result); 
' 

EMA.c.c <- cfunction(signature(cumbars="numeric", nperiod="numeric", n="numeric",  ratio="numeric"), EMA.c.c.code) 

EMA.cumulative.c<-function(cumulativeBars, nEMA = 4, period="days", numPeriods=15) { 
    ratio <- 2/(nEMA+1) 

    outEMA <- EMA.c.c(cumbars=Cl(cumulativeBars), nperiod=numPeriods, n=nEMA, ratio=ratio) 

    outEMA <- reclass(outEMA, Cl(cumulativeBars)) 

    colnames(outEMA) <- paste("EMA", nEMA, sep="") 

    return(outEMA) 
} 

getSymbols("SPY", from="2010-01-01") 

SPY.cumulative <- to.period.cumulative(SPY, name="SPY") 

system.time(
     SPY.EMA <- EMA.cumulative(SPY.cumulative) 
) 

system.time(
     SPY.EMA.c <- EMA.cumulative.c(SPY.cumulative) 
) 


res <- benchmark(EMA.cumulative(SPY.cumulative), EMA.cumulative.c(SPY.cumulative), 
     columns=c("test", "replications", "elapsed", "relative", "user.self", "sys.self"), 
     order="relative", 
     replications=10) 

print(res)

編輯：爲了讓過我麻煩的性能改進的指示（我相信它可以變得更好，因爲實際上我已經創建了雙重的for循環）這裏R是打印出來：

> print(res) 
           test replications elapsed relative user.self 
2 EMA.cumulative.c(SPY.cumulative)   10 0.026 1.000  0.024 
1 EMA.cumulative(SPY.cumulative)   10 57.732 2220.462 56.755

因此，按照我的標準，SF類型的改善...

來源

2012-01-15 21:36:08 Samo

感謝您分享此代碼並展示C等的實用程序。您對示例的時間有何評論？即'benchmark（）'調用的輸出是什麼？ – Iterator 2012-01-18 13:41:43

我對性能改進感到興奮（請參閱編輯後）。這是可以預料的，因爲在我的R代碼（由註釋＃TODO表示：這是sloooooooooooooooooow ...）我已經有效地創建了一個使用rbind和lapply的double循環。但是我的R技能是基本能夠使用R恢復到C的性能提高...... – Samo 2012-01-21 12:14:51

加速WMA（加權移動平均線）的計算

回答

相關問題