2016-11-03

How do I set a threshold for step()?

Thanks to this post regarding the failure of stepwise variable selection in lm, I have example data that looks like the data described there:

set.seed(1)   # for reproducible example 
x <- sample(1:500,500) # need this so predictors are not perfectly correlated. 
x <- matrix(x,nc=5) # 100 rows, 5 cols 
y <- 1+ 3*x[,1]+2*x[,2]+4*x[,5]+rnorm(100) # y depends on variables 1, 2, 5 only 

# you start here... 
df <- data.frame(y,as.matrix(x)) 
full.model <- lm(y ~ ., df)     # include all predictors 
step(full.model,direction="backward") 

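What I need is to select only the 5 best variables, and then the 6 best variables, out of these 20. Does anyone know how to impose such a constraint?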

Answer

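MuMIn::dredge() has an option, m.lim, that limits the number of terms in each candidate model.
[Note]: the number of combinations, and therefore the time required, grows exponentially with the number of predictors.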
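To give a rough sense of that growth, here is a small calculation in base R. The variable names are only illustrative; the counts are plain binomial coefficients for 20 candidate predictors.

```r
p <- 20                       # number of candidate predictors

n_five <- choose(p, 5)        # models with exactly 5 terms: 15504
n_six  <- choose(p, 6)        # models with exactly 6 terms: 38760
n_all  <- 2^p                 # every possible subset: 1048576

print(c(five = n_five, six = n_six, all = n_all))
```

Each extra predictor doubles the total number of subsets, which is why restricting the model size keeps the search manageable.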

set.seed(1)   # for reproducible example 
x <- sample(100*20) 
x <- matrix(x, ncol = 20)  # 100 rows, 20 predictors 
y <- 1 + 2*x[,1] + 3*x[,2] + 4*x[,3] + 5*x[,7] + 6*x[,8] + 7*x[,9] + rnorm(100) # y depends on variables 1,2,3,7,8,9 only 

df <- data.frame(y, as.matrix(x)) 
full.model <- lm(y ~ ., df)     # include all predictors 

library(MuMIn) 

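# dredge() requires na.action to be "na.fail", so that every sub-model is fitted to the same rows
options(na.action = "na.fail")

# m.lim = c(5, 5): only models with exactly 5 terms; trace = 2: a progress bar is displayed
dredge(full.model, m.lim = c(5, 5), trace = 2)   # result: x2, x3, x7, x8, x9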
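For the six-variable case, the same call with m.lim = c(6, 6) should presumably work. If MuMIn is not available, the idea can also be sketched in base R. The helper best_subset() below is hypothetical (it is not part of MuMIn or stats): it fits every subset of exactly k predictors and keeps the one with the smallest residual sum of squares. Among models of the same size, ranking by residual sum of squares gives the same order as ranking by AIC.

```r
# Hypothetical helper, not from any package: exhaustive search for the
# subset of exactly k predictors with the smallest residual sum of squares.
best_subset <- function(data, response, k) {
  predictors <- setdiff(names(data), response)
  y <- data[[response]]
  combos <- combn(predictors, k, simplify = FALSE)   # all subsets of size k
  rss <- vapply(combos, function(vars) {
    X <- cbind(1, as.matrix(data[vars]))             # design matrix with intercept
    sum(lm.fit(X, y)$residuals^2)                    # residual sum of squares
  }, numeric(1))
  combos[[which.min(rss)]]
}

# Same simulated data as in the answer above
set.seed(1)
x <- matrix(sample(100 * 20), ncol = 20)
y <- 1 + 2*x[,1] + 3*x[,2] + 4*x[,3] + 5*x[,7] + 6*x[,8] + 7*x[,9] + rnorm(100)
df <- data.frame(y, x)                               # columns: y, X1, ..., X20

best5 <- best_subset(df, "y", 5)                     # best 5-variable model
best6 <- best_subset(df, "y", 6)                     # best 6-variable model
print(best5)
print(best6)
```

This brute-force search fits 15504 models for k = 5 and 38760 for k = 6, so it is only practical for a modest number of predictors.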