2017-08-12

h2o ValueError: No metric tpr

When I try to get the recall score using, for example:

rf_model.recall() 

I get the error:

h2o ValueError: No metric tpr 

I can get other metrics such as accuracy, AUC, precision, and F1, but not recall... This might be a bug.

If I run:

from h2o.model.metrics_base import H2OBinomialModelMetrics as bmm 
reporter = bmm(rf_model.metric) 
rf_model.metric('recall') 

I get:

Could not find exact threshold 0.0; using closest threshold found 0.0. 

What is going on here?

I am running h2o version 'h2o-3.15.0.3990'.

I am following the H2O tutorial:

https://github.com/h2oai/h2o-tutorials/blob/master/training/h2o_algos/src/py/decision_tree_ensembles.ipynb

and with my own dataset I get the error above.

Any help?

Also, how can I plot a precision/recall curve with h2o?

Thanks

Comment: Please don't cross-post with the mailing list. (StackOverflow is the better place for this kind of question.)

Answer

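Starting with your second question: Flow has a precision/recall curve (and it is interactive). Flow runs on every node, by default on port 54321; if you are running h2o locally, Flow is at http://127.0.0.1:54321

I suspect there is something interesting about your data or your model, and it will become clear once you look at the precision/recall curve.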
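If you would rather build the curve outside Flow, the points can be assembled by hand from labels and predicted scores. The following is a minimal sketch in plain Python; it does not use the h2o API, and the function name `pr_points` and the toy data are made up for illustration. It assumes a row is predicted positive when its score is at or above the threshold.

```python
def pr_points(labels, scores, thresholds):
    """Return (threshold, precision, recall) triples for binary 0/1 labels."""
    points = []
    for t in thresholds:
        pairs = list(zip(labels, scores))
        tp = sum(1 for y, s in pairs if y == 1 and s >= t)  # true positives
        fp = sum(1 for y, s in pairs if y == 0 and s >= t)  # false positives
        fn = sum(1 for y, s in pairs if y == 1 and s < t)   # false negatives
        # Precision is undefined when nothing is predicted positive.
        precision = tp / (tp + fp) if (tp + fp) else None
        # Recall (the true positive rate) is undefined without positives.
        recall = tp / (tp + fn) if (tp + fn) else None
        points.append((t, precision, recall))
    return points


# Hypothetical labels and scores, only to show the shape of the result.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.35, 0.20, 0.05]
curve = pr_points(labels, scores, [0.1, 0.5, 0.9])
for t, p, r in curve:
    print(t, p, r)
```

Plotting recall on the x-axis against precision on the y-axis for a fine grid of thresholds gives the precision/recall curve.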

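In R, if you run str(m) (where m is your model), you will see all the model data. m@model$training_metrics@metrics$thresholds_and_metric_scores$recall holds the recall value for each threshold.

I couldn't figure out how to inspect the Python object, but your call is correct. In my quick test (the iris dataset with a 2-class enum column added):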

m.metric("recall") 

gave:

[[0.8160852636726422, 1.0]] 

If I wanted all the values, it would be something like this:

mDL.metric("recall",thresholds=[x/100.0 for x in range(1,100)]) 

which gives:

Could not find exact threshold 0.01; using closest threshold found 0.010396965719556233. 
Could not find exact threshold 0.02; using closest threshold found 0.016617060110009896. 
... 
Could not find exact threshold 0.92; using closest threshold found 0.9469528904679438. 
Could not find exact threshold 0.93; using closest threshold found 0.9469528904679438. 
Could not find exact threshold 0.94; using closest threshold found 0.9469528904679438. 
Could not find exact threshold 0.95; using closest threshold found 0.9469528904679438. 
Could not find exact threshold 0.96; using closest threshold found 0.9469528904679438. 
Could not find exact threshold 0.97; using closest threshold found 0.9760293572153097. 
Could not find exact threshold 0.98; using closest threshold found 0.9787491606489236. 
Could not find exact threshold 0.99; using closest threshold found 0.9909817370067531. 

[[0.01, 1.0], 
[0.02, 1.0], 
[0.03, 1.0], 
... 
[0.87, 1.0], 
[0.88, 1.0], 
[0.89, 0.9850746268656716], 
[0.9, 0.9850746268656716], 
[0.91, 0.9850746268656716], 
[0.92, 0.9850746268656716], 
[0.93, 0.9850746268656716], 
[0.94, 0.9850746268656716], 
[0.95, 0.9850746268656716], 
[0.96, 0.9850746268656716], 
[0.97, 0.9701492537313433], 
[0.98, 0.9552238805970149], 
[0.99, 0.8955223880597015]] 
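The "Could not find exact threshold" messages suggest that metrics are stored only at a fixed set of thresholds and that a requested value is snapped to the nearest stored one. A minimal sketch of that kind of lookup follows; it is an assumption about the behaviour, not h2o's actual implementation, and `closest_threshold` is a hypothetical helper. The stored values are copied from the messages above.

```python
def closest_threshold(stored, requested):
    """Return the stored threshold nearest to the requested one."""
    return min(stored, key=lambda t: abs(t - requested))


# Thresholds taken from the warning messages in the output above.
stored = [
    0.9469528904679438,
    0.9760293572153097,
    0.9787491606489236,
    0.9909817370067531,
]

for requested in (0.95, 0.97, 0.98, 0.99):
    print(requested, "->", closest_threshold(stored, requested))
```

This would also explain why several requested thresholds in a row return the same recall: they all map to the same stored threshold.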

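(I get such unusual output because the model learned my dataset almost perfectly; I suspect that is what is happening to you? I foolishly made my binary column a direct function of one of the input columns, with no noise!)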
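To see why a noise-free target produces such flat output, here is a small sketch with a made-up dataset in which the label is a direct function of the input and an idealised model scores every row with near certainty. The data, the scores, and the helper `recall_at` are all hypothetical; no h2o calls are involved.

```python
def recall_at(labels, scores, threshold):
    """Fraction of actual positives whose score is at or above the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    return sum(1 for s in positives if s >= threshold) / len(positives)


xs = [i / 20 for i in range(20)]
labels = [1 if x > 0.5 else 0 for x in xs]      # label is a pure function of x
scores = [0.99 if x > 0.5 else 0.01 for x in xs]  # idealised near-perfect model

thresholds = [t / 100.0 for t in range(1, 100)]
recalls = [recall_at(labels, scores, t) for t in thresholds]
print(set(recalls))
```

Every positive row scores far above every negative row, so recall stays at 1.0 for every threshold in the grid, much like the long run of 1.0 values in the output above.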