Error in confusionMatrix: the data and reference factors must have the same number of levels (R caret)

I have trained a tree model with R caret. Now I am trying to generate a confusion matrix and keep getting the following error:

Error in confusionMatrix.default(predictionsTree, testdata$catgeory) : the data and reference factors must have the same number of levels

prob <- 0.5 #Specify class split 
singleSplit <- createDataPartition(modellingData2$category, p=prob, 
            times=1, list=FALSE) 
cvControl <- trainControl(method="repeatedcv", number=10, repeats=5) 
traindata <- modellingData2[singleSplit,] 
testdata <- modellingData2[-singleSplit,] 
treeFit <- train(traindata$category~., data=traindata, 
       trControl=cvControl, method="rpart", tuneLength=10) 
predictionsTree <- predict(treeFit, testdata) 
confusionMatrix(predictionsTree, testdata$catgeory)

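The levels are the same on both objects, so I cannot figure out what the problem is. Their structure and levels are given below; they should be identical. Any help would be greatly appreciated, as this is driving me crazy!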

> str(predictionsTree) 
Factor w/ 30 levels "16-Merchant Service Charge",..: 28 22 22 22 22 6 6 6 6 6 ... 
> str(testdata$category) 
Factor w/ 30 levels "16-Merchant Service Charge",..: 30 30 7 7 7 7 7 30 7 7 ... 

> levels(predictionsTree) 
[1] "16-Merchant Service Charge" "17-Unpaid Cheque Fee"   "18-Gov. Stamp Duty"   "Misc"       "26-Standard Transfer Charge" 
[6] "29-Bank Giro Credit"   "3-Cheques Debit"    "32-Standing Order - Debit" "33-Inter Branch Payment"  "34-International"    
[11] "35-Point of Sale"    "39-Direct Debits Received" "4-Notified Bank Fees"   "40-Cash Lodged"    "42-International Receipts" 
[16] "46-Direct Debits Paid"  "56-Credit Card Receipts"  "57-Inter Branch"    "58-Unpaid Items"    "59-Inter Company Transfers" 
[21] "6-Notified Interest Credited" "61-Domestic"     "64-Charge Refund"    "66-Inter Company Transfers" "67-Suppliers"     
[26] "68-Payroll"     "69-Domestic"     "73-Credit Card Payments"  "82-CHAPS Fee"     "Uncategorised" 

> levels(testdata$category) 
[1] "16-Merchant Service Charge" "17-Unpaid Cheque Fee"   "18-Gov. Stamp Duty"   "Misc"       "26-Standard Transfer Charge" 
[6] "29-Bank Giro Credit"   "3-Cheques Debit"    "32-Standing Order - Debit" "33-Inter Branch Payment"  "34-International"    
[11] "35-Point of Sale"    "39-Direct Debits Received" "4-Notified Bank Fees"   "40-Cash Lodged"    "42-International Receipts" 
[16] "46-Direct Debits Paid"  "56-Credit Card Receipts"  "57-Inter Branch"    "58-Unpaid Items"    "59-Inter Company Transfers" 
[21] "6-Notified Interest Credited" "61-Domestic"     "64-Charge Refund"    "66-Inter Company Transfers" "67-Suppliers"     
[26] "68-Payroll"     "69-Domestic"     "73-Credit Card Payments"  "82-CHAPS Fee"     "Uncategorised"

Source

2014-07-17 user2987739

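In your error, `category` is spelled `catgeory`. If that is not the problem, what is the output of `identical(levels(predictionsTree), levels(testdata$category))`? – fxi

Hi, thanks, I have corrected the silly typo... doh! I ran the identical() check and it returned [1] TRUE. Now I get the following error when I run the confusionMatrix function: Error in table(data, reference, dnn = dnn, ...) : all arguments must have the same length – user2987739

Check for another misspelled `catgeory`, compare `length(testdata$category)` with `length(predictionsTree)`, and look at the summary of both vectors. If you only need a simple confusion matrix: `table(predictionsTree, testdata$category)` – fxi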

Maybe your model never predicted one of the factor levels. Use the table() function instead of confusionMatrix() to see whether that is the problem.
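As a rough illustration of that check, here is a base-R sketch on made-up labels. The vectors `truth` and `pred` are hypothetical stand-ins for testdata$category and predictionsTree, not the asker's data. It compares lengths and levels, rebuilds the predictions on the reference's level set, and cross-tabulates with table().

```r
# Hypothetical toy labels standing in for testdata$category and predictionsTree
truth <- factor(c("a", "b", "c", "a", "b"), levels = c("a", "b", "c"))
pred  <- factor(c("a", "b", "a", "a", "b"))   # the model never predicted "c"

length(pred) == length(truth)                 # lengths must match first
identical(levels(pred), levels(truth))        # FALSE here: "c" is missing from pred

# Rebuild the predictions with the reference's level set so both factors agree
pred_aligned <- factor(pred, levels = levels(truth))

# Plain cross-tabulation (rows: predicted, columns: actual)
cm <- table(predicted = pred_aligned, actual = truth)
print(cm)
```

An all-zero row in the table points to a level the model never predicted.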

Source

2014-10-31 05:36:44 Red

You could have added this as a comment. –

-2

There may be missing values in the test data. Add the following line before `predictionsTree <- predict(treeFit, testdata)` to remove the NAs. I had the same error, and now it works for me.

testdata <- testdata[complete.cases(testdata),]
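To see why missing values can lead to a length mismatch, consider this small sketch on a made-up data frame (`toy` is hypothetical). If the prediction step drops rows that contain NA, the predictions end up shorter than the untouched label column. Filtering with complete.cases() first keeps the two in step.

```r
# Hypothetical test set with one missing predictor value
toy <- data.frame(
  amount   = c(10, NA, 30, 40),
  category = factor(c("x", "y", "x", "y"))
)

complete.cases(toy)                # TRUE FALSE TRUE TRUE

# Keep only the rows without NA, as in the line above
toy_clean <- toy[complete.cases(toy), ]

nrow(toy)                          # 4 rows (and labels) before filtering
nrow(toy_clean)                    # 3 rows after filtering
length(toy_clean$category)         # 3, matches the rows left to predict on
```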

Source

2015-01-11 07:12:01 EaswerC

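The length problem you are running into is probably due to NAs in the training set. Either drop the incomplete cases, or impute so that you have no missing values.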
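For the imputation route, one very simple option is median/mode filling. The sketch below uses a made-up data frame (`toy` and its columns are hypothetical), and it is only one of many possible imputation strategies, not necessarily the one this answer had in mind.

```r
# Hypothetical data with a missing numeric and a missing categorical value
toy <- data.frame(
  amount = c(10, NA, 30, 50),
  type   = factor(c("debit", "credit", NA, "debit"))
)

# Numeric column: fill NA with the median of the observed values
toy$amount[is.na(toy$amount)] <- median(toy$amount, na.rm = TRUE)

# Factor column: fill NA with the most frequent observed level
most_common <- names(which.max(table(toy$type)))
toy$type[is.na(toy$type)] <- most_common

any(is.na(toy))   # FALSE: no missing values remain
```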

Source

2015-05-21 21:06:38 orange1

Try specifying na.pass for the na.action option:

predictionsTree <- predict(treeFit, testdata,na.action = na.pass)
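The difference between the two NA handlers can be seen on a plain data frame. This is only a base-R sketch on made-up data (`toy` is hypothetical). Whether a given model can actually produce a prediction for a row containing NA depends on the model.

```r
# Hypothetical data frame with one missing value
toy <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))

nrow(na.omit(toy))   # 2: the row containing NA is dropped
nrow(na.pass(toy))   # 3: the data is returned unchanged
```

With na.pass the number of rows, and so the number of predictions, stays equal to the number of reference labels.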

Source

2015-11-12 03:02:11 aristotll

I had the same problem, but went ahead and changed the data right after reading in the file, like so:

data = na.omit(data)

Thanks, all, for the pointers!

Source

2015-11-21 18:54:00 Alicia
