2012-10-16 18 views

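"Subscript out of bounds" when using the effects package

I am using the effects package to construct some probability plots showing the predicted probabilities from a logistic regression model. However, I am getting a strange error message and I don't know what the problem is.

When I try to generate the plot, I get the error shown below. The warning is not the issue; what I don't understand is what the error message is telling me.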

library(effects)  

dat$won_ping = as.factor(dat$won_ping) 

mod2 = glm(won_ping ~ our_bid + 
    age_of_oldest_driver2 + 
    credit_type2 + 
    coverage_type2 + 
    home_owner2 + 
    vehicle_driver_score + 
    currently_insured2 + 
    zipcode2, 
    data=dat, family=binomial(link="logit")) 

> plot(effect("our_bid*vehicle_driver_score", mod2), rescale.axis=FALSE, multiline=TRUE) 
Warning message: 
In analyze.model(term, mod, xlevels, default.levels) : 
    our_bid:vehicle_driver_score does not appear in the model 
Error in plot(effect("our_bid*vehicle_driver_score", mod2), rescale.axis = FALSE, : 
    error in evaluating the argument 'x' in selecting a method for function 'plot': Error in apply(mod.matrix[, components], 1, prod) : 
    subscript out of bounds 

Here is the structure of my data (the GLM command is shown above):

> str(dat) 
'data.frame': 85240 obs. of 71 variables: 
$ our_bid      : num 155 123 183 98 108 159 98 123 98 200 ... 
$ won_ping     : Factor w/ 2 levels "0","1": 1 1 2 1 1 1 1 1 1 1 ... 
$ zipcode2     : Factor w/ 4 levels "1:6999","10000:14849",..: 4 3 2 1 3 2 3 1 2 2 ... 
$ age_of_oldest_driver2  : Factor w/ 4 levels "18 to 21","22 to 25",..: NA 3 NA NA NA NA 3 NA 3 NA ... 
$ currently_insured2   : Factor w/ 2 levels "0","1": 2 1 2 2 1 1 2 2 1 1 ... 
$ credit_type2    : Ord.factor w/ 4 levels "POOR"<"FAIR"<..: 2 3 2 3 2 2 1 3 3 2 ... 
$ coverage_type2    : Factor w/ 4 levels "BASIC","MINIMUM",..: 4 3 3 3 3 3 3 3 4 3 ... 
$ home_owner2     : Factor w/ 2 levels "0","1": 1 2 2 2 2 2 2 2 2 2 ... 
$ vehicle_driver_score  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ... 

Finally, here is some information that may be useful:

> sessionInfo() 
R version 2.14.0 (2011-10-31) 
Platform: x86_64-pc-mingw32/x64 (64-bit) 

locale: 
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 
[4] LC_NUMERIC=C       LC_TIME=English_United States.1252  

attached base packages: 
[1] grid  stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] effects_2.2-1 colorspace_1.1-1 nnet_7.3-1  MASS_7.3-16  lattice_0.20-0 foreign_0.8-46 

loaded via a namespace (and not attached): 
[1] tools_2.14.0 

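Help! What does the error message mean? Normally, "subscript out of bounds" means I am selecting something outside the range of a data structure, but as far as I can tell that is not happening here.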
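For reference, the message itself can be reproduced in base R. The traceback points at `mod.matrix[, components]`, so the sketch below indexes a matrix by column name. The toy matrix and its column names are made up for illustration; it is not the matrix that `effect()` builds internally, and it is not meant as a diagnosis.

```r
# Toy matrix with named columns, standing in for a model matrix.
# (Hypothetical column names, chosen only for illustration.)
m <- matrix(1:6, nrow = 2,
            dimnames = list(NULL, c("our_bid", "score1", "home1")))

# Selecting columns that exist works as expected.
ok <- m[, c("our_bid", "score1")]
print(ok)

# Asking for a column name that is not present raises the error;
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(
  m[, c("our_bid", "no_such_column")],
  error = function(e) conditionMessage(e)
)
print(msg)
```

So the same message can appear when a requested column name is simply absent, not only when a numeric index is too large.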

Edit:

To @Roland:

As I said above, the warning and the error message are independent and unrelated. For example, suppose I take out zipcode2 and run the GLM:

mod2 = glm(won_ping ~ our_bid + 
    age_of_oldest_driver2 + 
    credit_type2 + 
    coverage_type2 + 
    home_owner2 + 
    vehicle_driver_score + 
    currently_insured2, 
    data=dat, family=binomial(link="logit")) 

> plot(effect("our_bid*home_owner2", mod2), rescale.axis=FALSE, multiline=TRUE) 
Warning message: 
In analyze.model(term, mod, xlevels, default.levels) : 
    our_bid:home_owner2 does not appear in the model 

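This produces only the warning, which is fine because I get the result I want. So the fact that the interaction term does not appear in the model is not the problem, and it is not what causes the error message.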

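You are right, the error does not follow directly from the warning. Still, I think they may be related. Are you able to fit the GLM if you include the interaction? – Roland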

Yes, it runs fine when I add the interaction. – ATMathew
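A minimal sketch of what fitting the model with the interaction included might look like. The data frame `sim` is simulated and only borrows column names from the question; the object names `fit_main` and `fit_int` are likewise made up for this example.

```r
set.seed(1)
n <- 500

# Simulated stand-in for `dat`; only the column names come from the question.
sim <- data.frame(
  our_bid              = round(runif(n, 90, 200)),
  vehicle_driver_score = factor(rbinom(n, 1, 0.3)),
  home_owner2          = factor(rbinom(n, 1, 0.7))
)
sim$won_ping <- factor(rbinom(n, 1, plogis(-3 + 0.02 * sim$our_bid)))

# Main-effects model, analogous in form to mod2 in the question.
fit_main <- glm(won_ping ~ our_bid + vehicle_driver_score + home_owner2,
                data = sim, family = binomial(link = "logit"))

# The same model with the interaction added explicitly.
fit_int <- update(fit_main, . ~ . + our_bid:vehicle_driver_score)

# The interaction is now one of the model's terms ...
has_term <- "our_bid:vehicle_driver_score" %in%
  attr(terms(fit_int), "term.labels")

# ... and a matching column exists in the model matrix.
int_cols <- grep(":", colnames(model.matrix(fit_int)), value = TRUE)

print(has_term)
print(int_cols)
```

If the effects package is installed, passing `fit_int` to `effect()` with the term `"our_bid:vehicle_driver_score"` would be the natural next step; that call is not run here.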

Answer

Try this:

with(dat, table(our_bid, vehicle_driver_score)) 

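I suspect you have some unpopulated cells. Given your edit, it now seems less likely that I was right to think the problem lies in just those two variables. Even though you have a large number of cases, it is still possible to end up with empty cells when the model is built from all of those factor variables.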
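One way such cells can arise is through missing values: `glm()` drops incomplete rows by default, and `str(dat)` shows many `NA`s in `age_of_oldest_driver2`. Below is a rough sketch of checking the rows that were actually used in the fit. The data are made up (only the column names are taken from the question), and the missingness pattern is constructed on purpose so that one cell empties out.

```r
# Balanced toy data: 3 bids x 2 scores x 20 replicates = 120 rows.
sim <- expand.grid(
  our_bid              = c(98, 123, 155),
  vehicle_driver_score = factor(c("0", "1")),
  rep                  = 1:20
)
sim$won_ping <- factor(rep(c(0, 1, 0, 0, 1), length.out = nrow(sim)))
sim$age_of_oldest_driver2 <- factor(rep(c("18 to 21", "22 to 25"), each = 60))

# Assumption for illustration: age is missing for every row in one cell.
cell <- sim$our_bid == 155 & sim$vehicle_driver_score == "1"
sim$age_of_oldest_driver2[cell] <- NA

fit <- glm(won_ping ~ our_bid + vehicle_driver_score + age_of_oldest_driver2,
           data = sim, family = binomial(link = "logit"))

# Cross-tabulation on the full data: every cell is populated.
tab_all <- with(sim, table(our_bid, vehicle_driver_score))

# Cross-tabulation on the rows glm() kept after dropping NAs.
tab_used <- with(model.frame(fit), table(our_bid, vehicle_driver_score))

print(tab_all)
print(tab_used)
print(which(tab_used == 0, arr.ind = TRUE))  # location of the empty cell
```

Tabulating `model.frame(fit)` instead of the raw data frame shows the cells as the model sees them, which can differ from the table on `dat` when other predictors have missing values.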