2016-12-06 20 views
0

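How do I extract a specific value from stability selection model output?

I have the summary output below, and I want to extract the selected variable from it (only the variable name, X10).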

> stab.glmnet 
    Stability Selection with unimodality assumption 

Selected variables: 
X10 
10 

Selection probabilities: 
X2 X1 X7 X3 X6 X4 X5 X8 X9 X10 
0.02 0.06 0.20 0.22 0.25 0.32 0.35 0.37 0.41 1.00 

--- 
Cutoff: 0.75; q: 3; PFER (*): 0.918 
(*) or expected number of low selection probability variables 
PFER (specified upper bound): 1 
PFER corresponds to signif. level 0.0918 (without multiplicity adjustment) 

I tried the following, but it only gives me the other value, which is 10:

var <- stab.glmnet$selected[[1]] 

Data:

set.seed(1001) 
n <- 100 
Y <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0) 
X1 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.1,0.4,0.5)) 
X2 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.5,0.25,0.25)) 
X3 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.3,0.4,0.4)) 
X4 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.35,0.35,0.3)) 
X5 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.1,0.2,0.7)) 
X6 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.8,0.1,0.1)) 
X7 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.1,0.1,0.8)) 
X8 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.35,0.35,0.3)) 
X9 <- sample(x=c(0,1,2), size=n, replace=TRUE, prob=c(0.35,0.35,0.3)) 
X10 <- c(2,2,2,2,2,2,2,2,2,2,1,2,2,2,1,2,2,1,2,2,2,2,2,2,2,2,2,2,1,2,2,2,2, 
2,1,2,1,1,2,1,2,1,2,1,2,1,1,2,1,2,0,0,0,0,0,0,0,0,1,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1, 
1,1,1,1,1,0,0,0,0,0,0,0,1,0,0,0,0) 

datasim <- data.frame(Y=as.factor(Y),X1,X2,X3,X4,X5,X6,X7,X8,X9,X10) 

Packages and stability selection code:

library("stabs") 
library("glmnet") 
x <- model.matrix(Y~.,datasim)[,-1] 
y <- datasim$Y 
y <- as.numeric(y) 

stab.glmnet <- stabsel(x,y ,fitfun = glmnet.lasso, cutoff = 0.75,PFER = 1) 

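Which package and function did you use to create the model? More generally, you'll make it easier for people to help you if you provide a reproducible example. – eipi10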

Thanks @eipi10, I've updated the question above. – Shima

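You don't say what you want to extract from the summary. Do you want the selection probability of the selected variable, or something else? – eipi10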

Answers

3

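I get an error when I run your code. In any case, if you look at the print.stabsel function, you can see where in the model object each piece of the summary output is stored. The code for print.stabsel is pasted below.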

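For example, if you want the selection probability of the selected variable, you can see that the indices of the selected variables are in stab.glmnet$selected and the selection probabilities are in stab.glmnet$max. So we can do the following: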

stab.glmnet$max[stab.glmnet$selected] 
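
Since the original code did not run for me, here is a minimal sketch of the same indexing on a mock object. The list `mock` is hypothetical: it only imitates the `selected` and `max` components, with values copied from the printed summary in the question, and it assumes `max` is stored in column order (X1 to X10) so that index 10 points at X10.

```r
# Hypothetical stand-in for stab.glmnet: only the two components used here.
# Values are copied from the printed summary, not recomputed by stabsel().
mock <- list(
  selected = c(X10 = 10L),
  max = c(X1 = 0.06, X2 = 0.02, X3 = 0.22, X4 = 0.32, X5 = 0.35,
          X6 = 0.25, X7 = 0.20, X8 = 0.37, X9 = 0.41, X10 = 1.00)
)

# Index the probability vector by the position(s) of the selected variable(s).
sel_prob <- mock$max[mock$selected]
print(sel_prob)
```

On a real fit, the same expression with stab.glmnet in place of mock should return a named vector with one entry per selected variable.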

To see what else is in the model object, look at the output of str(stab.glmnet).

Code for print.stabsel:

getAnywhere(print.stabsel) 
function (x, decreasing = FALSE, print.all = TRUE, ...) 
{ 
    cat("\tStability Selection") 
    if (x$assumption == "none") 
     cat(" without further assumptions\n") 
    if (x$assumption == "unimodal") 
     cat(" with unimodality assumption\n") 
    if (x$assumption == "r-concave") 
     cat(" with r-concavity assumption\n") 
    if (length(x$selected) > 0) { 
     cat("\nSelected variables:\n") 
     print(x$selected) 
    } 
    else { 
     cat("\nNo variables selected\n") 
    } 
    cat("\nSelection probabilities:\n") 
    if (print.all) { 
     print(sort(x$max, decreasing = decreasing)) 
    } 
    else { 
     print(sort(x$max[x$max > 0], decreasing = decreasing)) 
    } 
    cat("\n---\n") 
    print.stabsel_parameters(x, heading = FALSE) 
    cat("\n") 
    invisible(x) 
} 

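Thanks @eipi10 for the answer. Very useful. I had missed as.numeric(y), sorry. I only want the name of the selected variable. So I tried jota's suggestion (above), names(stab.glmnet$selected[1]), and it gave me what I wanted: just the variable name "X10". – Shima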
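
For anyone wondering why the original attempt returned only 10: the printed summary suggests that `selected` is a named integer vector, and `[[` drops names while `[` keeps them. A small sketch on a hypothetical vector of the same shape (the variable `selected` below is a mock, not the real model component):

```r
# Hypothetical copy of what stab.glmnet$selected appears to hold.
selected <- c(X10 = 10L)

idx    <- selected[[1]]      # `[[` drops the name: just the column index
nm     <- names(selected[1]) # `[` keeps the name, so names() can read it
all_nm <- names(selected)    # names of every selected variable at once

print(idx)
print(nm)
print(all_nm)
```

If more than one variable is selected, names(selected) is probably the more convenient form, since it does not depend on picking a position.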