2017-03-17 41 views

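Sklearn grid search with a grouped K-fold CV generator

I am trying to implement a parameter search in sklearn using randomized search together with a grouped k-fold cross-validation generator. The following works: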

skf=StratifiedKFold(n_splits=5,shuffle=True,random_state=0) 
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=skf,n_iter=10) 
rs.fit(X,y) 

This does not:

gkf=GroupKFold(n_splits=5) 
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=gkf,n_iter=10) 
rs.fit(X,y) 

#ValueError: The groups parameter should not be None 

How do I pass the groups parameter?

This does not work either:

gkf=GroupKFold(n_splits=5) 
fv = gkf.split(X, y, groups=groups) 
rs=sklearn.model_selection.RandomizedSearchCV(clf,parameters,scoring='roc_auc',cv=fv,n_iter=10) 
rs.fit(X,y) 

#TypeError: object of type 'generator' has no len() 

Answer


For reference, this is done by passing the groups to fit:

rs.fit(X,y,groups=groups) 

where rs is defined as

rs=sklearn.model_selection.RandomizedSearchCV(forest,parameters,scoring='roc_auc',cv=gkf,n_iter=10)
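
To make the answer concrete, here is a minimal end-to-end sketch. The synthetic data, the `groups` array, and the `parameters` grid are made up for illustration and are not from the original question; only the `GroupKFold`, `RandomizedSearchCV`, and `fit(..., groups=...)` calls mirror the snippets above. The second half shows one possible alternative, assuming you would rather precompute the splits: materialize the generator into a list before passing it as `cv`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, RandomizedSearchCV

# Synthetic data and group labels (assumptions, for illustration only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
groups = np.repeat(np.arange(20), 10)  # 20 groups of 10 samples each

forest = RandomForestClassifier(random_state=0)
# Hypothetical search space; replace with your own.
parameters = {"n_estimators": [10, 20, 50], "max_depth": [2, 4, None]}

gkf = GroupKFold(n_splits=5)

# Option 1: pass the splitter as cv and hand the groups to fit().
rs = RandomizedSearchCV(forest, parameters, scoring="roc_auc",
                        cv=gkf, n_iter=5, random_state=0)
rs.fit(X, y, groups=groups)
print(rs.best_params_)

# Option 2: precompute the splits as a list of (train, test) index pairs.
# A list has a length, unlike the bare generator returned by split().
splits = list(gkf.split(X, y, groups=groups))
rs_pre = RandomizedSearchCV(forest, parameters, scoring="roc_auc",
                            cv=splits, n_iter=5, random_state=0)
rs_pre.fit(X, y)
print(rs_pre.best_params_)
```

With either option, every group ends up entirely in the training part or entirely in the test part of each fold, which is the point of using GroupKFold.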