2015-04-16

Efficiency and speed of R code using Rcpp

This post is about using the Rcpp package to speed up R code that contains a recursive loop.

My input is defined by the following example (length 7), which is part of the data.frame (length 51673) that I use:

S=c(906.65,906.65,906.65,906.65,906.65,906.65,906.65) 
T=c(0.1371253,0.1457896,0.1248953,0.1261278,0.1156931,0.0985253,0.1332596) 
r=c(0.013975,0.013975,0.013975,0.013975,0.013975,0.013975,0.013975)    
h=c(0.001332596,0.001248470,0.001251458,0.001242143,0.001257921,0.0,0.0)  
P=c(3,1,5,2,1,4,2) 
A= data.frame(S=S,T=T,r=r,h=h,P=P) 

       S         T        r           h Per 
1 906.65 0.1971253 0.013975 0.001332596   3 
2 906.65 0.1971253 0.013975 0.001248470   1 
3 906.65 0.1971253 0.013975 0.001251458   5 
4 906.65 0.1971253 0.013975 0.001242143   2 
5 906.65 0.1971253 0.013975 0.001257921   1 
6 906.65 0.1971253 0.013975 0.000000000   4 
7 906.65 0.1971253 0.013975 0.000000000   2 

The parameters are:

w=0.001; b=0.2; a=0.0154; c=0.0000052; neta=-0.70 

This is the code of the function I want to use:

F<-function(x,w,b,a,c,neta,S,T,r,P){ 
    u=1i*x 
    nu=(1/(neta^2))*(((1-2*neta)^(1/2))-1) 
    # Recursion back to time t 
    # Terminal condition for the A and B 
    A_Q=0 
    B_Q=0 
    steps<-round(T*250,0) 

    for (j in 1:steps){ 
     A_Q= A_Q+ r*u + w*B_Q-(1/2)*log(1-2*a*(neta^4)*B_Q) 
     B_Q= b*B_Q+u*nu+ (1/neta^2)*(1-sqrt((1-2*a*(neta^4)*B_Q)*(1- 2*c*B_Q - 2*u*neta))) 
    } 
    F= exp(log(S)*u + A_Q + B_Q*h[P]) 
    return(F) 
} 

S = A$S ; r= A$r ; T= A$T ; P=A$P; h= A$h 

Then I want to apply the function above to my data set, over a vector of length N = 100000:

Z=length(S); N=100000 ; alpha=2 ; delta= 0.25  
    lambda=(2*pi)/(N*delta) 

res = matrix(nrow=N, ncol=Z) 
    for (i in 1:N){ 
    for (j in 1:Z){ 
     res[i,j]= Re(F(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j])) 
    } 
    } 

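But it takes a lot of time: this code needs 20 seconds to run for N = 100, and I want to run it for N = 100000, so the whole run could take hours. How can I tune the code above with Rcpp to reduce the execution time and get an efficient program?

Is it possible to reduce the execution time? If so, please suggest a solution, even one without Rcpp.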

Thanks.

... which has great support for vectorized computations.


In the computation of x = ((delta*(i-1)) - (alpha+1)*1i), 1i is how R writes the imaginary unit of a complex number, and i takes values from 1 to N.

Answer


Your function F can be converted to C++ quite easily by taking advantage of the vec and cx_vec classes in the Armadillo library (accessed through the RcppArmadillo package).


#include <RcppArmadillo.h> 
// [[Rcpp::depends(RcppArmadillo)]] 

// [[Rcpp::export]] 
arma::cx_vec Fcpp(const arma::cx_vec& x, double w, double b, double a, double c, 
        double neta, const arma::vec& S, const arma::vec& T, 
        const arma::vec& r, Rcpp::IntegerVector P, Rcpp::NumericVector h) { 

    arma::cx_vec u = x * arma::cx_double(0.0,1.0); 

    double nu = (1.0/std::pow(neta,2.0)) * (std::sqrt(1.0-2.0*neta)-1.0); 
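    // Terminal condition: explicitly start both recursions from zero 
    arma::cx_vec A_Q(r.size(), arma::fill::zeros); 
    arma::cx_vec B_Q(r.size(), arma::fill::zeros); 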
    arma::vec steps = arma::round(T*250.0); 

    for (size_t j = 0; j < steps.size(); j++) { 
    for (size_t k = 0; k < steps[j]; k++) { 
     A_Q = A_Q + r*u + w*B_Q - 
       0.5*arma::log(1.0 - 2.0*a*std::pow(neta,4.0)*B_Q); 
     B_Q = b*B_Q + u*nu + (1.0/std::pow(neta,2.0)) * 
       (1.0 - arma::sqrt((1.0 - 2.0*a*std::pow(neta,4.0)*B_Q) * 
       (1.0 - 2.0*c*B_Q - 2.0*u*neta))); 
    } 
    } 

    arma::vec hP = Rcpp::as<arma::vec>(h[P-1]); 
    arma::cx_vec F = arma::exp(arma::log(S)*u + A_Q + B_Q*hP); 

    return F; 
} 
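
To sanity-check the recursion outside of R, a scalar version can be written with the standard library alone. The following is a minimal sketch and not part of the code above: F_scalar is a hypothetical name, it handles a single (S, T, r, h) row, and it assumes the caller has already selected the h value (h[P] in R) and passes it as a plain double.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using cplx = std::complex<double>;

// Hypothetical helper: the same recursion as F, for one row of the data.
// h is the already-selected value (h[P] in the R code).
cplx F_scalar(cplx x, double w, double b, double a, double c, double neta,
              double S, double T, double r, double h) {
    const cplx u = cplx(0.0, 1.0) * x;  // u = 1i * x
    const double neta2 = neta * neta;
    const double neta4 = neta2 * neta2;
    const double nu = (1.0 / neta2) * (std::sqrt(1.0 - 2.0 * neta) - 1.0);

    // Terminal condition for A and B
    cplx A_Q(0.0, 0.0);
    cplx B_Q(0.0, 0.0);
    const long steps = std::lround(T * 250.0);
    assert(steps >= 0);

    for (long j = 0; j < steps; ++j) {
        const cplx B_old = B_Q;  // both updates read the previous B_Q
        A_Q = A_Q + r * u + w * B_old -
              0.5 * std::log(1.0 - 2.0 * a * neta4 * B_old);
        B_Q = b * B_old + u * nu +
              (1.0 / neta2) *
                  (1.0 - std::sqrt((1.0 - 2.0 * a * neta4 * B_old) *
                                   (1.0 - 2.0 * c * B_old - 2.0 * u * neta)));
    }
    return std::exp(std::log(S) * u + A_Q + B_Q * h);
}
```

Two easy checks follow from the formulas: for x = 0 every term of the recursion vanishes, so the result is 1 for any number of steps, and for T = 0 with x = -1i (so that u = 1) the result is simply S.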

Just a couple of minor changes to note:

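  • I use arma:: functions for vector calculations, such as arma::log, arma::exp, arma::round, and arma::sqrt, as well as various overloaded operators (*, +, -), but use std::pow and std::sqrt for scalar calculations. In R this is abstracted away from us, but here we have to distinguish between the two cases.
  • Your function F has one loop, for (j in 1:steps), but the C++ version has two, simply because of the difference in loop semantics between the two languages.
  • Most of the input vectors are arma:: classes (as opposed to Rcpp::NumericVector and Rcpp::ComplexVector). The exceptions are P and h, because Rcpp vectors provide R-like element access, e.g. h[P-1]. Also note that P has to be offset by 1 (C++ uses 0-based indexing), and the result is then converted to an Armadillo vector (hP) with Rcpp::as<arma::vec>, because the compiler complains if you try to multiply a cx_vec by a NumericVector (B_Q*hP).
  • I added a function parameter h. Relying on the existence of a global variable h, as you were doing in F, is not a good idea. If you need it in the function body, you should pass it into the function.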
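
The index offset in the third point can be illustrated without Rcpp. This is a minimal sketch in plain C++, assuming std::vector inputs; gather_one_based is a hypothetical helper name, not an Rcpp or Armadillo function. It reproduces what h[P] does in R by shifting each 1-based position to a 0-based one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: select h at the 1-based positions in P,
// which is what h[P] does in R (and h[P-1] in the Rcpp code).
std::vector<double> gather_one_based(const std::vector<double>& h,
                                     const std::vector<int>& P) {
    std::vector<double> out;
    out.reserve(P.size());
    for (int p : P) {
        // Guard against positions that are out of range for h
        assert(p >= 1 && static_cast<std::size_t>(p) <= h.size());
        out.push_back(h[static_cast<std::size_t>(p) - 1]);  // shift to 0-based
    }
    return out;
}
```

With the h and P from the question, the first selected element is h[3] in R terms, i.e. 0.001251458.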

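I renamed your function to Fr, and to make benchmarking a little easier, I just wrapped the double loop that fills the matrix res into the functions loop_Fr and loop_Fcpp: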

loop_Fr <- function(mat = res) { 
    for (i in 1:N) { 
    for (j in 1:Z) { 
     mat[i,j]= Re(Fr(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j],h)) 
    } 
    } 
    return(mat) 
} 
loop_Fcpp <- function(mat = res) { 
    for (i in 1:N) { 
    for (j in 1:Z) { 
     mat[i,j]= Re(Fcpp(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j],h)) 
    } 
    } 
    return(mat) 
} 
## 
R> all.equal(loop_Fr(),loop_Fcpp()) 
[1] TRUE 

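I compared the two functions for N = 100, N = 1000, and N = 100000 (which took forever), adjusting lambda and res accordingly but keeping everything else the same. Generally speaking, Fcpp is about 10 times faster than Fr on my computer: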

N <- 100 
lambda <- (2*pi)/(N*delta) 
res <- matrix(nrow=N, ncol=Z) 
## 
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=50L) 
Unit: milliseconds 
     expr  min  lq median  uq  max neval 
    loop_Fr() 142.44694 146.62848 148.97571 151.86318 186.67296 50 
loop_Fcpp() 14.72357 15.26384 15.58604 15.85076 20.19576 50 

N <- 1000 
lambda <- (2*pi)/(N*delta) 
res <- matrix(nrow=N, ncol=Z) 
## 
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=50L) 
Unit: milliseconds 
     expr  min  lq median  uq  max neval 
    loop_Fr() 1440.8277 1472.4429 1491.5577 1512.5636 1565.6914 50 
loop_Fcpp() 150.6538 153.2687 155.4156 158.0857 181.8452 50 

N <- 100000 
lambda <- (2*pi)/(N*delta) 
res <- matrix(nrow=N, ncol=Z) 
## 
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=2L) 
Unit: seconds 
     expr  min  lq median  uq  max neval 
    loop_Fr() 150.14978 150.14978 150.33752 150.52526 150.52526  2 
loop_Fcpp() 15.49946 15.49946 15.75321 16.00696 16.00696  2 
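
The roughly linear growth of these timings with N matches a simple count of recursion steps: each call runs round(T*250) iterations, so the seven T values from the question give 220 iterations per value of i. The sketch below only does that arithmetic; steps_per_row and total_iterations are hypothetical helper names. It uses std::lround, which may differ from R's round() on exact .5 ties, although none of the sample values is a tie.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: recursion steps per row, round(T * 250).
std::vector<long> steps_per_row(const std::vector<double>& T) {
    std::vector<long> steps;
    steps.reserve(T.size());
    for (double t : T) {
        assert(t >= 0.0);
        steps.push_back(std::lround(t * 250.0));
    }
    return steps;
}

// Hypothetical helper: total inner-loop iterations for N values of i.
long long total_iterations(const std::vector<double>& T, long long N) {
    long long per_grid_point = 0;
    for (long s : steps_per_row(T)) {
        per_grid_point += s;
    }
    return per_grid_point * N;
}
```

For the sample T vector the step counts are 34, 36, 31, 32, 29, 25, and 33, so N = 100000 corresponds to 22,000,000 iterations of the inner recursion.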

The other variables, as given in your question:

S <- c(906.65,906.65,906.65,906.65,906.65,906.65,906.65) 
T <- c(0.1371253,0.1457896,0.1248953,0.1261278,0.1156931,0.0985253,0.1332596) 
r <- c(0.013975,0.013975,0.013975,0.013975,0.013975,0.013975,0.013975)    
h <- c(0.001332596,0.001248470,0.001251458,0.001242143,0.001257921,0.0,0.0)  
P <- c(3,1,5,2,1,4,2) 
w <- 0.001; b <- 0.2; a <- 0.0154; c <- 0.0000052; neta <- (-0.70) 
Z <- length(S) 
alpha <- 2; delta <- 0.25