Before running glm(), you can simply exclude those three variables from mydata.
Here I create some example data:
set.seed(1)
mydata <- replicate(10, rnorm(100, 300, 50))
mydata <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE), mydata)
> head(mydata)
dv X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 1 268.6773 268.9817 320.4701 344.6837 353.7220 303.8652 282.9467 264.6216 245.6546 222.9299
2 1 309.1822 302.1058 384.4437 247.6351 394.7827 285.1566 375.1212 398.5786 208.6958 309.7161
3 1 258.2186 254.4539 379.3294 398.5669 269.8501 240.8379 326.4154 295.5001 349.7641 313.2211
4 0 379.7640 307.9014 283.4546 280.8184 280.4566 300.5646 327.1096 299.2991 299.4069 244.0632
5 0 316.4754 267.2708 185.7382 382.7073 279.1889 349.5801 293.1663 243.8272 270.0186 332.5476
6 0 258.9766 388.3644 424.8831 375.6106 281.2171 379.6984 243.1633 232.7935 291.1026 248.3550
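If I fit the model on these data with the formula specified as dv ~ ., every other variable in the data frame is used on the right-hand side: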
model <- glm(dv ~ ., data = mydata, family = binomial(link = 'logit'))
> summary(model)
Call:
glm(formula = dv ~ ., family = binomial(link = "logit"), data = mydata)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.8891 -1.0853 -0.5163 1.0237 1.8303
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.4330825 4.1437180 -0.587 0.5571
X1 -0.0020482 0.0049025 -0.418 0.6761
X2 -0.0059021 0.0046298 -1.275 0.2024
X3 0.… 0.0047991 2.568 0.0102 *
X4 0.0024804 0.0046856 0.529 0.5966
X5 0.0025348 0.0039545 0.641 0.5215
X6 -0.0005905 0.0047417 -0.125 0.9009
X7 -0.0001758 0.0040737 -0.043 0.9656
X8 0.0042362 0.0041170 1.029 0.3035
X9 -0.0007664 0.0042471 -0.180 0.8568
X10 -0.0042089 0.0043094 -0.977 0.3287
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 138.59 on 99 degrees of freedom
Residual deviance: 125.11 on 89 degrees of freedom
AIC: 147.11
Number of Fisher Scoring iterations: 4
Now I exclude X1 and X2 from mydata and run the model again:
mydata2 <- mydata[, -match(c('X1', 'X2'), colnames(mydata))]
model2 <- glm(dv ~ ., data = mydata2, family = binomial(link = 'logit'))
> summary(model2)
Call:
glm(formula = dv ~ ., family = binomial(link = "logit"), data = mydata2)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.8983 -1.0724 -0.4521 1.1132 1.7792
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.8725545 3.6357314 -1.340 0.18019
X3 0.0124982 0.0047930 2.608 0.00912 **
X4 0.0031911 0.0045971 0.694 0.48758
X5 0.0015992 0.0038101 0.420 0.67467
X6 -0.0003295 0.0046554 -0.071 0.94357
X7 0.0003372 0.0039961 0.084 0.93275
X8 0.0038889 0.0040737 0.955 0.33977
X9 -0.0010014 0.0042078 -0.238 0.81189
X10 -0.0041691 0.0042232 -0.987 0.32356
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 138.59 on 99 degrees of freedom
Residual deviance: 126.93 on 91 degrees of freedom
AIC: 144.93
Number of Fisher Scoring iterations: 4
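If you would rather not create a second data frame, the same exclusion can probably be done in the formula itself. The following is only a minimal sketch on the same simulated data. The names drop_vars and model3 are made up for illustration, and the `%in%` filter is a swapped-in alternative to the match() call used above.

```r
# Recreate the example data used above
set.seed(1)
mydata <- replicate(10, rnorm(100, 300, 50))
mydata <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE), mydata)

# Option 1: drop the columns by name before fitting
# (drop_vars is an illustrative name, not from the original answer)
drop_vars <- c("X1", "X2")
mydata2 <- mydata[, !(colnames(mydata) %in% drop_vars)]
model2 <- glm(dv ~ ., data = mydata2, family = binomial(link = "logit"))

# Option 2: leave mydata untouched and remove the terms in the formula
model3 <- glm(dv ~ . - X1 - X2, data = mydata,
              family = binomial(link = "logit"))

# The two fits use the same predictors, so the coefficients should agree
all.equal(coef(model2), coef(model3))
```

Option 2 keeps the original data frame intact, which can be convenient if you want to try several sets of excluded variables in a row.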