2011-12-14 130 views
3

我想乘以一些數據,並且它以一種非常奇怪的方式失敗。我有選項(錯誤=恢復)設置,並且mi()不會引發錯誤,但它會在第一次嘗試後停止輸入。它也不會返回一個值。因此我不知道從哪裏開始進行調試。下面的最小重現性示例。沒有錯誤被拋出的錯誤

> library(mi) 
> temp <- mi(dat) 
Beginning Multiple Imputation (Wed Dec 14 10:44:44 2011): 
Iteration 1 
Chain 1 : HLTHA5.fac* BMI* INCOME* 
> temp 
Error: object 'temp' not found 


dat<-structure(list(treat = c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, 
FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, 
FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE), NUMADULT = c(2, 
1, 2, 1, 2, 1, 2, 1, 2, 4, 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2, 
3, 3, 1), HLTHA5.fac = structure(c(3L, NA, 3L, 2L, 4L, 5L, 5L, 
4L, 4L, 3L, 3L, 5L, 3L, 4L, 5L, 4L, 2L, 2L, 3L, 5L, 4L, 5L, 4L, 
3L, 3L), .Label = c("0", "1", "2", "3", "4"), class = "factor"), 
    SOURCEA = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L), .Label = c("Yes", "No", "Don't know", "Refused"), class = "factor"), 
    BMI = c(27.363941459046, 24.0265857515842, 34.3236939308346, 
    27.0907152026518, 32.6101901381975, 34.1643655360753, 21.4628674188624, 
    29.1751398412094, 22.5924920198551, 39.6719545438681, 38.5220574557939, 
    20.1156133421915, 30.6612391698034, 35.7332536282609, 26.5664872147956, 
    25.6016897082437, 19.3649931598758, 28.1868713091175, NA, 
    32.4438116170843, 32.5507197719099, 21.1090717674633, 32.2340044872853, 
    24.3699149340904, 27.3369153440247), SMOKE2 = structure(c(2L, 
    1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L), .Label = c("Yes", "No" 
    ), class = "factor"), INCOME = structure(c(16L, 4L, 13L, 
    11L, 13L, 7L, 22L, 6L, NA, 1L, 13L, 18L, 12L, 20L, NA, NA, 
    2L, 13L, 17L, NA, 12L, 21L, 9L, 15L, 13L), .Label = c("Less than $10,800", 
    "$10,800-$14,600", "$14,601-$16,250", "$16,251-$18,300", 
    "$18,301-$21,800", "$21,801-$25,000", "$25,001-$27,500", 
    "$27,501-$29,300", "$29,301-$33,100", "$33,101-$36,700", 
    "$36,701-$38,700", "$38,701-$44,200", "$44,201-$50,000", 
    "$50,001-$58,000", "$58,001-$66,500", "$66,501-$73,500", 
    "$73,501-$80,000", "$80,001-88,200", "$88,201-$100,000", 
    "$100,001-$120,000", "$120,001-$130,000", "$130,001-$150,000", 
    "$150,001-$250,000", "Over $250,000", "Don't know", "Refused" 
    ), class = "factor"), RESPMAR = structure(c(1L, 5L, 1L, 4L, 
    3L, 6L, 1L, 1L, 1L, 1L, 4L, 4L, 1L, 1L, 4L, 1L, 3L, 3L, 1L, 
    3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Married", "Living w partner", 
    "Widowed", "Divorced", "Separated", "Single", "Other", "Don't know", 
    "Refused"), class = "factor"), RESPGRAD = structure(c(5L, 
    1L, 2L, 5L, 3L, 3L, 5L, 2L, 4L, 2L, 4L, 3L, 4L, 4L, 2L, 3L, 
    2L, 5L, 2L, 4L, 4L, 5L, 2L, 2L, 3L), .Label = c("< HS 0-11", 
    "HS graduate", "Some colge 13-15", "Collge grad 16", "Post college 16+", 
    "Don't know", "Refused"), class = "factor"), RACEA2 = structure(c(1L, 
    1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 
    3L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("White (Not-Latino)", 
    "Black (Not-Latino)", "Latino (total)", "Asian", "Biracial/Multi", 
    "Native American", "Other", "Don't know", "Refused"), class = "factor"), 
    INSUREDA = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 
    1L), .Label = c("Insured", "Not insured", "Don't know", "Refused" 
    ), class = "factor"), PAP.adj = c(TRUE, FALSE, FALSE, FALSE, 
    FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, 
    TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, 
    TRUE, TRUE)), .Names = c("treat", "NUMADULT", "HLTHA5.fac", 
"SOURCEA", "BMI", "SMOKE2", "INCOME", "RESPMAR", "RESPGRAD", 
"RACEA2", "INSUREDA", "PAP.adj"), row.names = c(1L, 13L, 15L, 
23L, 26L, 33L, 38L, 53L, 56L, 60L, 62L, 85L, 109L, 116L, 138L, 
217L, 240L, 262L, 264L, 269L, 277L, 295L, 328L, 334L, 338L), class = "data.frame") 

任何想法從哪裏開始?

更新

由於下面的診斷技術,我就找到了錯誤,這我在這裏總結,因爲它似乎我不是有這個問題的唯一一個。

當您具有無級別的無序分類變量且沒有值時,會發生此錯誤。 mi.default調用.initializeConvCheckArray以使用NAs填充AveVar。該功能使用這些級別,而不管這些級別是否被使用。相比之下,爲了填寫AveVar,它會調用.getmean,這會降低未使用的級別。因此尺寸不匹配。

用戶端的簡單解決方案當然是在調用mi.info和mi之前刪除額外的級別。然而,我將提交錯誤修復程序給套件作者,因爲已經花費太多的時間來跟蹤這個問題了。

+0

'調試(MI)'和單步執行代碼? Nevermind,'mi'是S4通用的,'debug(mi)'不提供任何有用的東西。 –

回答

7

由於error=recover選項無法正常工作,因此可行的替代方法是設置options(error=dump.frames)。這將讓你對錯誤的一些信息,您可以打印出來,或者更有效,檢查與debugger()

ls() 
# [1] "dat" 
options(error=dump.frames) 
mi(dat) 
ls() 
# [1] "dat"  "last.dump" # Apparently there WAS an error 

# INVESTIGATE WITH debugger() 
debugger(dump=last.dump) 

# ALTERNATIVELY, PRINT last.dump TO CONSOLE 
last.dump 

$`mi(dat)` 
<environment: 0x05155c44> 

$`mi(dat)` 
<environment: 0x05158f30> 

$`.local(object, ...)` 
<environment: 0x05158cac> 

$`mi.default(object, info, n.imp, n.iter, R.hat, max.minutes, rand.imp.method` 
<environment: 0x047dc3a0> 

attr(,"error.message") 
[1] "Error in aveVar[s, i, ] <- c(avevar.mean, avevar.sd) : 
\n number of items to replace is not a multiple of replacement length\n" 
attr(,"class") 
[1] "dump.frames" 
+0

這是一個非常整潔的把戲! –

+0

太棒了。謝謝。 –

+0

@PaulHiemstra和gsk3 - 是的,我已經閱讀了幾年前,但從未發現它的需要。即使錢伯斯(2008)也說:「通常,dump.frames()沒有優勢,因爲如果計算不是交互的,recover()的行爲就像dump.frames()一樣。」 –