Difference between fastLm and fastLmPure from RcppArmadillo

Here is an example:

require(Rcpp) 
require(RcppArmadillo) 
require(zoo) 
require(repmis) 

myData <- source_DropboxData(file = "example.csv", 
           key = "cbrmkkbssu5bn96", sep = ",", header = TRUE) 

dolm = function(x) coef(fastLmPure(as.matrix(x[,2]), x[,1])) 

myCoef = rollapply(myData, 260, dolm, by.column = FALSE) 

summary(myCoef) # 80923 NA's 

dolm2 = function(x) coef(fastLm(x[,1] ~ x[,2] + 0, data = as.data.frame(x))) 

myCoef2 = rollapply(myData, 260, dolm2, by.column = FALSE) 

summary(myCoef2) # 0 NA's

In the example above, the first method, which uses fastLmPure, produces NAs in the output, while the second method, which uses fastLm, does not.

Here is the link to the fastLm and fastLmPure functions written in R:

https://github.com/RcppCore/RcppArmadillo/blob/master/R/fastLm.R

And here is the link to the underlying fastLm function written in C++:

https://github.com/RcppCore/RcppArmadillo/blob/master/src/fastLm.cpp

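From these links and the RcppArmadillo documentation, it is not obvious to me what causes the difference in the outputs. Why are there no NAs in the second output? Most importantly, which routine or part of the code prevents NAs from appearing in the second method, and how is it implemented?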


2015-11-07 Bobby Digital

I would invite you to [read the source code](https://github.com/RcppCore/RcppArmadillo/blob/master/R/fastLm.R). It really is not that complicated...

@DirkEddelbuettel Thanks, I found the problem. PS. Pointing to a link that was already provided in the question is very helpful, though ;)

You are calling two different functions with two different interfaces.

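In particular, fastLm(), when used via a formula y ~ X, relies on R-internal (and slow!!) functions to create the vector and matrix that correspond to a call to fastLm(X, y).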
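To make the point about the formula interface concrete, here is a minimal base-R sketch of the preprocessing a formula method typically performs before handing a vector and matrix to the fitting routine. The toy data frame `d` is hypothetical, and the sketch assumes R's factory-fresh setting `options(na.action = "na.omit")`. It illustrates the mechanism and is not a copy of the RcppArmadillo code.

```r
# Hypothetical toy data: row 3 has a missing predictor value
d <- data.frame(y = c(1, 2, 3, 4, 5), x = c(2, 4, NA, 8, 10))

# A formula interface usually starts by building a model frame.
# With the factory-fresh option na.action = "na.omit" (assumed here),
# rows containing NA are silently dropped at this step.
mf <- model.frame(y ~ x + 0, data = d)
X  <- model.matrix(attr(mf, "terms"), mf)
y  <- model.response(mf)

nrow(d)    # rows in the raw data
nrow(X)    # rows left after incomplete cases were dropped
anyNA(X)   # no NA reaches the fitting routine

# A matrix interface receives exactly what is passed in, NA included
Xraw <- as.matrix(d[, "x", drop = FALSE])
anyNA(Xraw)
```

So the NA handling happens in R, before any C++ code runs. A function that takes a matrix and a vector directly has no such step unless the caller adds one.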

Here is a trivial example to set things up:

R> data(mtcars) 
R> lm(mpg ~ cyl + disp + hp + wt - 1, data=mtcars) 

Call: 
lm(formula = mpg ~ cyl + disp + hp + wt - 1, data = mtcars) 

Coefficients: 
    cyl    disp      hp      wt
 5.3560 -0.1206 -0.0313  5.6913

R> fastLm(mpg ~ cyl + disp + hp + wt - 1, data=mtcars) 

Call: 
fastLm.formula(formula = mpg ~ cyl + disp + hp + wt - 1, data = mtcars) 

Coefficients: 
      cyl      disp        hp        wt
 5.356014 -0.120609 -0.031306  5.691273
R> fastLm(mtcars[, c("cyl","disp","hp","wt")], mtcars[,"mpg"]) 

Call: 
fastLm.default(X = mtcars[, c("cyl", "disp", "hp", "wt")], y = mtcars[, 
    "mpg"]) 

Coefficients: 
      cyl      disp        hp        wt
 5.356014 -0.120609 -0.031306  5.691273
R>

Now let us add NAs on both the left-hand and right-hand sides. To make indexing easy, we will use an entire row:

R> mtcars[7, ] <- NA 
R> lm(mpg ~ cyl + disp + hp + wt - 1, data=mtcars) 

Call: 
lm(formula = mpg ~ cyl + disp + hp + wt - 1, data = mtcars) 

Coefficients: 
    cyl    disp      hp      wt
 5.3501 -0.1215 -0.0332  5.8281

R> fastLm(mpg ~ cyl + disp + hp + wt - 1, data=mtcars) 

Call: 
fastLm.formula(formula = mpg ~ cyl + disp + hp + wt - 1, data = mtcars) 

Coefficients: 
      cyl      disp        hp        wt
 5.350102 -0.121478 -0.033184  5.828065
R> fastLm(na.omit(mtcars[, c("cyl","disp","hp","wt")]), na.omit(mtcars[,"mpg"])) 

Call: 
fastLm.default(X = na.omit(mtcars[, c("cyl", "disp", "hp", "wt")]), 
    y = na.omit(mtcars[, "mpg"])) 

Coefficients: 
      cyl      disp        hp        wt
 5.350102 -0.121478 -0.033184  5.828065
R>

Here is the kicker: the results are still the same across all methods, provided we are consistent about the missing values.
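Applied to the rolling-window setup from the question, being consistent about missing values could look like the sketch below. The helper name `fit_complete` is hypothetical, and `stats::lm.fit` is used only as a base-R stand-in so the example runs without extra packages. With RcppArmadillo installed, one would presumably pass a `fit` function built on fastLmPure instead, as noted in the comment.

```r
# Hypothetical helper: drop incomplete rows of a window, then fit a
# no-intercept regression of column 1 on column 2 via a matrix interface.
# `fit` defaults to base R's lm.fit as a stand-in; with RcppArmadillo one
# might use: function(X, y) RcppArmadillo::fastLmPure(X, y)$coefficients
fit_complete <- function(x,
                         fit = function(X, y) stats::lm.fit(X, y)$coefficients) {
  ok <- stats::complete.cases(x)      # TRUE for rows without any NA
  x  <- x[ok, , drop = FALSE]         # the same filtering the formula path applies
  fit(as.matrix(x[, 2, drop = FALSE]), x[, 1])
}

# Toy window (hypothetical data): y = 2 * x, with one missing response
w <- cbind(y = c(2, 4, NA, 8), x = c(1, 2, 3, 4))
fit_complete(w)
```

The helper could then replace `dolm` in the `rollapply()` call, for example `rollapply(myData, 260, fit_complete, by.column = FALSE)`. A window with fewer complete rows than predictors would still need separate handling.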


2015-11-07 21:24:01
