2014-01-12

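Weka - limiting classification errors of a certain type

I want to use Weka's logistic regression. Is there a way to tell Weka to try to minimize one particular type of error? I don't mind more instances being wrongly classified as b, but I want to minimize the number of b instances that are classified as a.

This is my output: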

Logistic Regression with ridge parameter of 1.0E-8 
Coefficients... 
            Class 
Variable        yes 
========================================= 
cmapArithAvg      28.9022 
cnllArithAvg      1.8342 
cmapGeoAvg      -92.0111 
cnllGeoAvg      -0.6321 
avgCatchAllScorer      0 
cmapMin      -15333.0622 
cmapMinInternal    15210.7515 
cnllMin       0.0267 
cmapStdev       -0.9583 
cnllStdev       -2.0748 
numphones       0.3234 
Intercept       12.3432 


Odds Ratios... 
            Class 
Variable        yes 
========================================= 
cmapArithAvg   3.564876537642066E12 
cnllArithAvg      6.2601 
cmapGeoAvg        0 
cnllGeoAvg       0.5315 
avgCatchAllScorer      1 
cmapMin         0 
cmapMinInternal     Infinity 
cnllMin       1.0271 
cmapStdev       0.3835 
cnllStdev       0.1256 
numphones       1.3818 


Time taken to build model: 0.67 seconds 
Time taken to test model on training data: 0.28 seconds 

=== Error on training data === 

Correctly Classified Instances  11383    95.2791 % 
Incorrectly Classified Instances  564    4.7209 % 
Kappa statistic       0.7434 
Mean absolute error      0.0723 
Root mean squared error     0.1883 
Relative absolute error     36.4503 % 
Root relative squared error    59.8021 % 
Total Number of Instances   11947  


=== Confusion Matrix === 

     a     b   <-- classified as
 10442   171 |     a = yes
   393   941 |     b = no



=== Stratified cross-validation === 

Correctly Classified Instances  11376    95.2206 % 
Incorrectly Classified Instances  571    4.7794 % 
Kappa statistic       0.7401 
Mean absolute error      0.0726 
Root mean squared error     0.189 
Relative absolute error     36.5861 % 
Root relative squared error    60.0198 % 
Total Number of Instances   11947  


=== Confusion Matrix === 

     a     b   <-- classified as
 10439   174 |     a = yes
   397   937 |     b = no

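AFAIK there is no direct way to do this. Most classifiers have no such notion. What you can try is duplicating the instances of the class you want to emphasize more.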

Answer


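You could try cost-sensitive classification. You define a cost matrix that assigns a larger cost to the errors you want to minimize. Since most classifiers try to minimize the average error, they will then try to avoid those errors.

You can do this in WEKA with the meta-classifier CostSensitiveClassifier. An example in the Weka Explorer is shown in this blog post.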
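
To make the cost-matrix idea concrete, here is a minimal, self-contained Java sketch of the minimum-expected-cost decision rule applied to a classifier's predicted probabilities. It does not use the Weka API and is not Weka's implementation. The class and method names are made up for illustration, the cost of 5 is an arbitrary assumption, and the matrix is laid out with rows as the actual class and columns as the predicted class, in the same a = yes, b = no order as the confusion matrix in the question.

```java
// Illustrative sketch only: names and cost values are assumptions, not Weka code.
class CostSensitiveSketch {

    // COST[actual][predicted]; index 0 = a (yes), index 1 = b (no).
    // Classifying an actual b as a is assumed to be 5 times as costly
    // as classifying an actual a as b.
    static final double[][] COST = {
        {0.0, 1.0},   // actual a: predicted a, predicted b
        {5.0, 0.0}    // actual b: predicted a, predicted b
    };

    // Returns the index of the predicted class with the lowest expected cost,
    // given probs[k] = estimated probability that the actual class is k.
    static int minExpectedCostClass(double[] probs, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < cost[0].length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < probs.length; actual++) {
                expected += probs[actual] * cost[actual][predicted];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] names = {"a (yes)", "b (no)"};
        // Hypothetical probability outputs from a logistic model.
        double[][] examples = {{0.70, 0.30}, {0.90, 0.10}};
        for (double[] probs : examples) {
            int chosen = minExpectedCostClass(probs, COST);
            System.out.println("P(a)=" + probs[0] + " -> predict " + names[chosen]);
        }
    }
}
```

With these assumed costs, an instance is labelled a only when P(a) exceeds 5/6, so fewer b instances end up classified as a, at the price of more a instances classified as b. That is the trade-off the question asks for.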