Obtaining residuals from a pre-specified regression model using R

My goal is to obtain a linear regression model for a dataset, along with its associated residuals, after removing outliers.

Using the `iris` dataset to illustrate:

The original model, with no observations removed:

(MODEL1)

library(dplyr)
library(magrittr)
library(broom)

iris %>%
    do(tidy(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .)))

               term   estimate  std.error statistic      p.value
1       (Intercept)  2.3903891 0.26226815  9.114294 5.942826e-16
2       Sepal.Width  0.4322172 0.08138982  5.310458 4.025982e-07
3      Petal.Length  0.7756295 0.06424566 12.072869 1.151112e-23
4 Speciesversicolor -0.9558123 0.21519853 -4.441537 1.759999e-05
5  Speciesvirginica -1.3940979 0.28566053 -4.880261 2.759618e-06

However, I would like to remove some outliers (based on `.cooksd`) before fitting. That is:

（MODEL2）

iris %>%
    do(augment(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .))) %>%
    filter(.cooksd < 0.03) %>%
    do(tidy(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .)))


               term   estimate  std.error statistic      p.value
1       (Intercept)  2.3927287 0.23718040 10.088223 2.875549e-18
2       Sepal.Width  0.4150542 0.07374143  5.628508 9.775805e-08
3      Petal.Length  0.8035635 0.05975821 13.446914 7.229176e-27
4 Speciesversicolor -0.9858935 0.19651867 -5.016793 1.589618e-06
5  Speciesvirginica -1.4841365 0.26399083 -5.621924 1.008374e-07
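
For readers who want to see what the filter in MODEL2 does without broom, here is a minimal base R sketch. It assumes that `cooks.distance()` from the stats package reports the same quantity as the `.cooksd` column returned by `augment()`; the names `fit_full`, `keep`, `iris_trimmed` and `fit_trimmed` are illustrative and do not appear in the original post.

```r
# Fit the model on all rows of iris (corresponds to MODEL1 above).
fit_full <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)

# Cook's distance for every observation, computed with base R (stats).
cd <- cooks.distance(fit_full)

# TRUE for rows that pass the 0.03 cutoff used in the question.
keep <- cd < 0.03
iris_trimmed <- iris[keep, ]

# Refit on the retained rows only (corresponds to MODEL2 above).
fit_trimmed <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species,
                  data = iris_trimmed)

sum(!keep)         # how many observations the filter dropped
coef(fit_trimmed)  # coefficients of the refitted model
```

Keeping the logical vector `keep` around makes it easy to see later which rows were excluded.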

Saving these models:

lm_model2 <- iris %>% 
    do(augment(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .))) %>% 
    filter(.cooksd < 0.03) %>% 
    lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .) 


lm_model1 <- iris %>% 
    lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .)

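Having done this, is it possible to obtain regression residuals for the dataset based on the second model?

The only solution I can think of is to calculate them indirectly using the coefficients of model 2, i.e.: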

Residual = Sepal.Length - (2.3927287 + 0.4150542 * Sepal.Width + 0.8035635 * Petal.Length + [-0.9858935 * Speciesversicolor] or + [-1.4841365 * Speciesvirginica])
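
The hand calculation above can also be written without typing the coefficients, taking the residual as observed minus predicted. This is only a sketch under the assumption that the model is refitted in base R with `cooks.distance()` standing in for `.cooksd`; `fit_trimmed`, `X`, `fitted_all` and `resid_manual` are illustrative names.

```r
# Rebuild the trimmed model in base R (same formula and cutoff as the question).
fit_full <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)
iris_trimmed <- iris[cooks.distance(fit_full) < 0.03, ]
fit_trimmed <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species,
                  data = iris_trimmed)

# Design matrix for ALL rows of iris. Speciesversicolor and Speciesvirginica
# appear as 0/1 dummy columns, in the same order as coef(fit_trimmed).
X <- model.matrix(Sepal.Length ~ Sepal.Width + Petal.Length + Species,
                  data = iris)

# Predicted values from the trimmed model's coefficients, for every row.
fitted_all <- drop(X %*% coef(fit_trimmed))

# Residual = observed - predicted.
resid_manual <- iris$Sepal.Length - fitted_all
head(resid_manual)
```

For a setosa row both dummy columns are 0, so only the intercept and the two slope terms contribute.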

Is there a better way? Something like:

residuals <- obtain_residuals(iris, lm_model2)

Many thanks.

Source

2016-11-04 Tony2016

Did you save your `lm` model object?

Why not use `Sepal.Length - predict(model)`?

I think that is implied by my question.

With the help of 42's `predict` suggestion, I believe the following will work. It could also be turned into a function if needed.

iris %>% 
    do(augment(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .))) %>% 
    filter(.cooksd < 0.03) %>% 
    lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, na.action=na.exclude, data=.) %>% 
    predict(iris) %>% 
    cbind(predicted = ., iris) %>% 
    mutate(residual = Sepal.Length - predicted)
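
As a rough idea of what such a function might look like, here is a minimal sketch of the hypothetical `obtain_residuals()` named in the question. It is not an existing R or broom function. It assumes the response is a plain, untransformed column of the data, and it rebuilds the trimmed model in base R with `cooks.distance()` standing in for `.cooksd`.

```r
# Hypothetical helper: residuals of `data` under an already fitted `model`.
# Assumes the left-hand side of the formula is a single untransformed column.
obtain_residuals <- function(data, model) {
  response <- all.vars(formula(model))[1]          # e.g. "Sepal.Length"
  data[[response]] - predict(model, newdata = data)
}

# Refit the trimmed model (same formula and cutoff as in the question).
fit_full <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)
iris_trimmed <- iris[cooks.distance(fit_full) < 0.03, ]
fit_trimmed <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species,
                  data = iris_trimmed)

# One residual per row of iris, including rows the filter removed.
res <- obtain_residuals(iris, fit_trimmed)
length(res)
```

For rows that were kept by the filter, these values should agree with `residuals(fit_trimmed)`; the extra entries are the residuals of the excluded observations.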

Thank you all for your help and suggestions.

Source

2016-11-04 04:50:29 Tony2016

I think your `tidy()` call stripped out much of the normal output from `lm`.

mylm<- iris %>% 
    do(augment(lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .))) %>% 
    filter(.cooksd < 0.03) %>% 
    lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, .) 

head(mylm$residuals) 

          1           2           3           4           5           6
 0.12959260  0.13711970 -0.06553479 -0.28474207 -0.01191282  0.02250186

Source

2016-11-04 00:57:24 akaDrHouse

I don't think this calculates residuals for the observations excluded by the filter – Tony2016
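
The point of this comment can be checked directly. The sketch below assumes a base R refit with `cooks.distance()` in place of `.cooksd`; `fit_trimmed` and `missing_rows` are illustrative names, not from the thread.

```r
# Refit the trimmed model (same formula and cutoff as in the question).
fit_full <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)
iris_trimmed <- iris[cooks.distance(fit_full) < 0.03, ]
fit_trimmed <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species,
                  data = iris_trimmed)

# The stored residuals cover only the rows the model was fitted on.
length(residuals(fit_trimmed))
nrow(iris)

# Row names of iris that have no stored residual (the filtered-out rows).
missing_rows <- setdiff(rownames(iris), names(residuals(fit_trimmed)))
missing_rows
```

Residuals for the rows listed in `missing_rows` have to come from `predict()` on the full data, as in the answer above.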
