JobLibValueError when using GridSearchCV from SKLearn

I am trying to use scikit-learn's grid search to find the best parameters for my random forest. I am doing it as follows:

from sklearn.metrics import classification_report 
from sklearn.pipeline import Pipeline 
from sklearn.grid_search import GridSearchCV 

pipeline = Pipeline([('clf', RandomForestRegressor(random_state=50))]) 
parameters = { 
'clf__n_estimators': (50, 100, 200), 
'clf__max_depth': (50, 150, 250), 
'clf__min_samples_split': (1, 2, 3, 4, 5), 
'clf__min_samples_leaf': (1, 2, 3, 4, 5) 
} 

grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1,verbose=1, scoring='neg_mean_squared_error') 
grid_search.fit(X, Y) 
print 'Best score: %0.3f' % grid_search.best_score_ 
print 'Best parameters set:' 

best_parameters = grid_search.best_estimator_.get_params() 
for param_name in sorted(parameters.keys()): 
    print '\t%s: %r' % (param_name, best_parameters[param_name]) 

predictions = grid_search.predict(X) 
print classification_report(Y, predictions)

Unfortunately, I get a JobLibValueError pointing to:

---> 14 grid_search.fit(X, Y)

For reference, my X looks like this:

0 1 2 3 4 5 6 7 8 9 ... 76613 76614 76615 76616 76617 76618 76619 76620 _engaged_time _title 
0 0.0 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 20000.0 54 
1 0.0 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 55000.0 40

and my Y values are just a series of engaged times (integers).

Thanks for your help!


2017-07-02 bclayman

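Can you post the full stack trace of the error? –

Why are you adding a single operation to a 'Pipeline'? –

I posted a possible solution. Can you upload X and Y so I can try to reproduce the error? – sera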
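On the full stack trace requested above: a JobLibValueError usually wraps a ValueError raised inside a worker process, so one way to see the underlying message is to rerun the search in a single process. The sketch below is only an illustration under assumptions. The data is synthetic (the asker's X and Y were not posted), the grid is cut down to a single parameter, and `cv=3` and `error_score="raise"` are choices made here. It uses `min_samples_split=1` as one candidate cause, because current scikit-learn releases reject an integer `min_samples_split` below 2. Whether that is what failed in the original run is not confirmed.

```python
import traceback

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption: the real X and Y were never posted).
rng = np.random.RandomState(0)
X = rng.rand(30, 4)
Y = rng.randint(0, 1000, size=30)

pipeline = Pipeline([("clf", RandomForestRegressor(n_estimators=5, random_state=50))])

# Only the suspicious parameter is searched, to keep the reproduction small.
parameters = {"clf__min_samples_split": (1, 2)}

# n_jobs=1 keeps everything in one process; error_score="raise" re-raises
# the first fitting error instead of recording a failed score.
grid_search = GridSearchCV(
    pipeline,
    parameters,
    n_jobs=1,
    cv=3,
    scoring="neg_mean_squared_error",
    error_score="raise",
)

caught = None
try:
    grid_search.fit(X, Y)
except ValueError as exc:
    caught = exc
    traceback.print_exc()  # full stack trace of the underlying error
```

If the printed traceback names a specific parameter, removing that value from the grid is a reasonable first thing to try before changing anything else.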

Try:

1) Replace:

from sklearn.grid_search import GridSearchCV

with

from sklearn.model_selection import GridSearchCV

2) Update the sklearn module:

pip install -U scikit-learn or conda install scikit-learn

Solution 1) fixed a similar problem I ran into.
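To make the two suggestions concrete, here is a minimal sketch of the question's script using the `sklearn.model_selection` import and Python 3 `print` calls. Several things in it are assumptions rather than the asker's setup: the data is synthetic, the grid is shrunk so it runs in seconds, `cv=3` and `n_jobs=1` are chosen for a quick run (`n_jobs=-1` as in the question should also work), and integer `min_samples_split` values start at 2 because current scikit-learn releases reject 1. Since the estimator is a regressor, the sketch reports `mean_squared_error` and `r2_score` in place of `classification_report`, which is meant for classification targets.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV  # replaces sklearn.grid_search
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption: the real X and Y were never posted).
rng = np.random.RandomState(50)
X = rng.rand(60, 5)
Y = (1000 * X[:, 0] + 100 * rng.rand(60)).astype(int)

pipeline = Pipeline([("clf", RandomForestRegressor(random_state=50))])

# Reduced grid for illustration; integer min_samples_split starts at 2.
parameters = {
    "clf__n_estimators": (10, 20),
    "clf__max_depth": (5, 10),
    "clf__min_samples_split": (2, 3),
    "clf__min_samples_leaf": (1, 2),
}

grid_search = GridSearchCV(
    pipeline,
    parameters,
    n_jobs=1,
    cv=3,
    verbose=0,
    scoring="neg_mean_squared_error",
)
grid_search.fit(X, Y)

print("Best score: %0.3f" % grid_search.best_score_)
print("Best parameters set:")
best_parameters = grid_search.best_estimator_.get_params()
for param_name in sorted(parameters.keys()):
    print("\t%s: %r" % (param_name, best_parameters[param_name]))

# Regression metrics on the training data, in place of classification_report.
predictions = grid_search.predict(X)
print("MSE: %0.3f" % mean_squared_error(Y, predictions))
print("R^2: %0.3f" % r2_score(Y, predictions))
```

Note that `best_score_` is a negated mean squared error here, so values closer to zero are better.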


2017-07-03 14:21:28 sera
