sklearn auc ValueError: Only one class present in y_true

I searched Google and read the Stack Overflow posts about this error, but none of them match my case.

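I use keras to train a simple neural network and make predictions on a held-out test split. But when I use roc_auc_score to compute the AUC, I get "ValueError: Only one class present in y_true. ROC AUC score is not defined in that case."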

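I checked the target label distribution, and the labels are highly imbalanced. Some labels (there are 29 labels in total) have only 1 instance, so they very likely have no positive instance in the test labels. That is why sklearn's roc_auc_score reports the only-one-class problem, which is reasonable.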
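For the situation described above, one possible workaround is to compute the AUC label by label and skip any label whose test column contains a single class. The following is only a sketch: `per_label_auc` is a hypothetical helper (not part of scikit-learn), and the toy arrays stand in for the real test labels and predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def per_label_auc(y_true, y_score):
    """Return one AUC per label column, NaN where AUC is undefined.

    Hypothetical helper for illustration, not a scikit-learn function.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    aucs = []
    for j in range(y_true.shape[1]):
        # AUC needs at least one positive and one negative in y_true.
        if np.unique(y_true[:, j]).size < 2:
            aucs.append(np.nan)
        else:
            aucs.append(roc_auc_score(y_true[:, j], y_score[:, j]))
    return np.array(aucs)


# Toy test split: label 0 has both classes, label 1 has no positives.
y_true = np.array([[0, 0], [1, 0], [0, 0], [1, 0]])
y_score = np.array([[0.1, 0.2], [0.9, 0.1], [0.3, 0.4], [0.8, 0.3]])

aucs = per_label_auc(y_true, y_score)
print(aucs)              # the single-class label is reported as nan
print(np.nanmean(aucs))  # macro average over the defined labels only
```

Averaging with `np.nanmean` silently drops the undefined labels, so it is worth reporting how many labels were skipped alongside the score.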

But I am curious, because when I use sklearn's cross_val_score function, it handles the AUC calculation without any error.

from sklearn.model_selection import cross_val_score  # formerly sklearn.cross_validation

my_metric = 'roc_auc'
scores = cross_val_score(myestimator, data, labels, cv=5, scoring=my_metric)

I don't know what happens inside cross_val_score. Is it because cross_val_score splits the data with stratified cross-validation?

== UPDATE ==
I kept digging, but I still cannot find the difference behind this. I see that cross_val_score calls check_scoring(estimator, scoring=None, allow_none=False) to get a scorer, and check_scoring calls get_scorer(scoring), which returns scorer=SCORERS[scoring]

And SCORERS['roc_auc'] is roc_auc_scorer;
roc_auc_scorer is created by

roc_auc_scorer = make_scorer(roc_auc_score, greater_is_better=True, 
           needs_threshold=True)

So it still uses the roc_auc_score function. I don't understand why cross_val_score behaves differently from calling roc_auc_score directly.
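To check that the scorer really is a thin wrapper around roc_auc_score, the two can be compared on the same held-out rows. This is a sketch under assumptions: the data comes from `make_classification`, and `LogisticRegression` stands in for the keras model. For an estimator that exposes `decision_function`, the 'roc_auc' scorer is expected to use those scores.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data (an assumption for illustration).
X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

# A scorer takes (estimator, X, y) and asks the estimator for scores itself.
scorer = get_scorer("roc_auc")
via_scorer = scorer(clf, X_test, y_test)

# The equivalent direct call on the same rows.
direct = roc_auc_score(y_test, clf.decision_function(X_test))

print(via_scorer, direct)
```

If the two numbers agree, any difference in behaviour has to come from which rows end up in each test fold rather than from the metric itself.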

Source

2016-08-18 Allan Ruin

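What is 'my_metric'? – maxymoo

@maxymoo I use the string 'roc_auc', which is a valid value. –

If you are doing cross-validation and have too few examples of some label, some folds may not contain that label at all. Try reducing the number of folds, and make sure you use stratified sampling. – Kris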

I think your hunch is right. AUC (the area under the ROC curve) needs enough examples of both classes to be meaningful.
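The effect of stratification can be seen on a toy label vector. The sketch below assumes 5 positives among 100 samples and compares a plain `KFold` with `StratifiedKFold`. As far as I know, when `cv` is an integer, cross_val_score uses `StratifiedKFold` if the estimator is a classifier and `y` is binary or multiclass, and `KFold` otherwise.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy labels (an assumption for illustration): 5 positives, 95 negatives.
y = np.array([1] * 5 + [0] * 95)
X = np.zeros((len(y), 1))  # features do not matter for the split itself


def classes_per_fold(splitter):
    """Number of distinct classes found in each test fold."""
    return [int(np.unique(y[test]).size) for _, test in splitter.split(X, y)]


plain = classes_per_fold(KFold(n_splits=5))
stratified = classes_per_fold(StratifiedKFold(n_splits=5))

print(plain)       # a fold with one class would make roc_auc_score fail
print(stratified)  # stratification keeps both classes in every fold
```

With only one instance of a label, no splitter can put a positive in every fold, which is why reducing the number of folds or pooling predictions across folds is suggested here.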

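By default, cross_val_score computes the performance metric on each fold separately. Another option is to use cross_val_predict and compute the AUC over the predictions of all folds combined.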

You could do something like this:

from sklearn.metrics import roc_auc_score 
from sklearn.model_selection import cross_val_predict  # formerly sklearn.cross_validation
from sklearn.linear_model import LogisticRegression 
from sklearn.datasets import make_classification 


class ProbaEstimator(LogisticRegression):
    """
    This little hack is needed because `cross_val_predict`
    calls `estimator.predict(X)` internally by default.

    Replace `LogisticRegression` with whatever classifier you like.

    """
    def predict(self, X):
        # Return the positive-class probability instead of hard labels.
        return super().predict_proba(X)[:, 1]


# some example data 
X, y = make_classification() 

# define your estimator 
estimator = ProbaEstimator() 

# get predictions 
pred = cross_val_predict(estimator, X, y, cv=5) 

# compute AUC score 
roc_auc_score(y, pred)
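In scikit-learn versions where `cross_val_predict` accepts a `method` argument, the subclass hack may be unnecessary. The following sketch assumes such a version and uses the same kind of synthetic data, requesting out-of-fold probabilities directly.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

# Synthetic binary data (an assumption for illustration).
X, y = make_classification(random_state=0)

# Out-of-fold class probabilities, shape (n_samples, n_classes).
proba = cross_val_predict(
    LogisticRegression(), X, y, cv=5, method="predict_proba")

# Column 1 holds the probability of the positive class.
pooled_auc = roc_auc_score(y, proba[:, 1])
print(pooled_auc)
```

Because the AUC is computed once over all pooled predictions, it stays defined as long as the full label vector contains both classes, even if a single fold does not.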

Source

2016-08-19 19:56:18 Kris
