2016-11-22

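Running speedlm on weighted data with missing values

I want to run a linear regression on weighted data.
When I use speedlm and the data contains missing values, I get an error message.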

library(speedglm) 
sampleData <- data.frame(w = round(runif(12,0,1)), 
          target = rnorm(12,100,50), 
          predictor = c(NA, rnorm(10, 40, 10),NA)) 

summary(sampleData) 
       w              target          predictor    
 Min.   :0.0000   Min.   : -3.381   Min.   :22.58  
 1st Qu.:0.0000   1st Qu.: 48.321   1st Qu.:30.45  
 Median :1.0000   Median : 84.156   Median :37.09  
 Mean   :0.5833   Mean   : 92.306   Mean   :35.03  
 3rd Qu.:1.0000   3rd Qu.:119.891   3rd Qu.:41.96  
 Max.   :1.0000   Max.   :223.896   Max.   :43.48  
                                    NA's   :2      
#run linear regression without weights 
linearNoWeights <- lm(formula("target~predictor"), data = sampleData) 
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData) 

#run linear regression with weights 
linearWithWeights <- lm(formula("target~predictor"), data = sampleData, weights = sampleData[,"w"]) 
speedLinearWithWeights <- speedlm(formula("target~predictor"), data = sampleData, weights = sampleData[,"w"]) 
Error in base::crossprod(x, y) : non-conformable arguments 
In addition: Warning messages: 
1: In sqw * X : 
    longer object length is not a multiple of shorter object length 
2: In sqw * y : 
    longer object length is not a multiple of shorter object length 
Called from: base::crossprod(x, y) 

Is there any way to solve this that does not force me to fix the data before running the regression?

Why do you object to removing these two observations from the dataset before fitting the model? – Roland

@Roland What I show here is just an example. I actually have many data frames, and the NAs matter for the rest of my calculations. – eliavs

Answer

You should try changing the na.action option. Below is your code, which I was able to run after changing na.action to na.exclude/na.omit.

library(speedglm) 
sampleData <- data.frame(w = round(runif(12,0,1)), 
         target = rnorm(12,100,50), 
         predictor = c(NA, rnorm(10, 40, 10),NA)) 
summary(sampleData) 

# Fit under the current na.action setting
linearNoWeights <- lm(formula("target~predictor"), data = sampleData) 
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData) 

# Change how missing values are handled for the whole session
options(na.action="na.exclude") # or "na.omit" 

# Refit with the new setting
linearNoWeights <- lm(formula("target~predictor"), data = sampleData) 
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData) 
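
To illustrate how the two settings differ, here is a minimal sketch using only base lm(), so it runs without speedglm. The set.seed call and the names fitOmit and fitExclude are my own additions for illustration. Both settings drop the incomplete rows before fitting; the difference shows up in what residuals() and fitted() return afterwards.

```r
# Base R only; the seed is an assumption added to make the sketch reproducible
set.seed(42)
sampleData <- data.frame(target = rnorm(12, 100, 50),
                         predictor = c(NA, rnorm(10, 40, 10), NA))

# Pass na.action per call instead of changing the global option
fitOmit    <- lm(target ~ predictor, data = sampleData, na.action = na.omit)
fitExclude <- lm(target ~ predictor, data = sampleData, na.action = na.exclude)

# Both fits use the same 10 complete rows, so the coefficients agree
all.equal(coef(fitOmit), coef(fitExclude))

# na.omit: residuals only for the rows that were actually used
length(residuals(fitOmit))

# na.exclude: residuals padded with NA back to the original 12 rows
length(residuals(fitExclude))
which(is.na(residuals(fitExclude)))
```

If the residuals or fitted values have to be written back into the original data frame, the padded form from na.exclude is usually the more convenient one.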

You can read the documentation for na.omit and na.exclude to learn when to use which. Hope this helps.
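
The refits in this answer do not pass weights. If the weighted speedlm call still fails after changing na.action, one possible workaround is to select the complete rows on the fly. This is only a sketch. It assumes that speedlm does not trim the weights vector when rows with NA are dropped, which is what the length-mismatch warnings in the question suggest. The name keep and the set.seed call are my own additions. sampleData itself is never modified, so the NA values stay available for later calculations.

```r
library(speedglm)

# Seed is an assumption added for reproducibility; data as in the question
set.seed(1)
sampleData <- data.frame(w = round(runif(12, 0, 1)),
                         target = rnorm(12, 100, 50),
                         predictor = c(NA, rnorm(10, 40, 10), NA))

# TRUE for rows where every model variable, including the weight, is observed
keep <- complete.cases(sampleData[, c("target", "predictor", "w")])

# Subset data and weights with the same index so their lengths match;
# the subsetting happens inside the call, sampleData is left untouched
speedFit <- speedlm(target ~ predictor,
                    data = sampleData[keep, ],
                    weights = sampleData[keep, "w"])

# Reference fit: lm() drops the NA rows and the matching weights itself
lmFit <- lm(target ~ predictor, data = sampleData, weights = w)

# Compare the two sets of coefficients
all.equal(as.numeric(coef(speedFit)), as.numeric(coef(lmFit)))
```

With many data frames, the two lines that build keep and call speedlm could be wrapped in a small function and applied to each one.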