2017-04-21

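GridSearch with MultiOutputRegressor?

Consider a multivariate regression problem with 2 response variables: longitude and latitude. Currently, some machine learning model implementations, such as the support vector regressor sklearn.svm.SVR, do not natively support multi-output regression. For such models, sklearn.multioutput.MultiOutputRegressor can be used as a wrapper.

An example: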

from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Wrap SVR so that one regressor is fitted per response column
svr_multi = MultiOutputRegressor(SVR(), n_jobs=-1)

#Fit the algorithm on the data 
svr_multi.fit(X_train, y_train) 
y_pred= svr_multi.predict(X_test) 
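
To make the wrapper's behaviour concrete, here is a minimal sketch on synthetic data. The arrays X_demo and y_demo are made up for illustration and merely stand in for the real longitude/latitude data, which is not shown in the question. It checks that one SVR is fitted per response column and that predictions keep both columns.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 60 samples, 3 features, 2 targets
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
y_demo = np.column_stack([
    X_demo[:, 0] + 0.1 * rng.normal(size=60),  # hypothetical "longitude"
    X_demo[:, 1] - X_demo[:, 2],               # hypothetical "latitude"
])

multi = MultiOutputRegressor(SVR()).fit(X_demo, y_demo)

# One fitted SVR clone per target column
n_fitted = len(multi.estimators_)
# Predictions have one column per target
pred_shape = multi.predict(X_demo[:5]).shape
print(n_fitted, pred_shape)
```

Each entry of estimators_ is an independent SVR, so the two responses are modelled separately and no correlation between them is exploited.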

My goal is to tune the parameters of SVR with sklearn.model_selection.GridSearchCV. Ideally, if the response were a single variable rather than several, I would proceed as follows:

from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_svr = (Pipeline([('scl', StandardScaler()), 
        ('reg', SVR())])) 

grid_param_svr = { 
    'reg__C': [0.01,0.1,1,10], 
    'reg__epsilon': [0.1,0.2,0.3], 
    # Needs the 'reg__' step prefix; degree only affects kernel='poly'
    'reg__degree': [2,3,4]
} 

gs_svr = (GridSearchCV(estimator=pipe_svr, 
        param_grid=grid_param_svr, 
        cv=10, 
        scoring = 'neg_mean_squared_error', 
        n_jobs = -1)) 

gs_svr = gs_svr.fit(X_train,y_train) 

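However, since my response y_train is 2-dimensional, I need to wrap SVR in MultiOutputRegressor. How can I modify the code above to make this grid search work? If that is not possible, is there a better alternative?

Answer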

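I just found a working solution. With nested estimators, the parameters of the inner estimator can be accessed through the estimator__ prefix. Inside the pipeline step named 'reg', the C parameter of SVR therefore becomes reg__estimator__C.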

from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_svr = Pipeline([('scl', StandardScaler()), 
     ('reg', MultiOutputRegressor(SVR()))]) 

grid_param_svr = { 
    'reg__estimator__C': [0.1,1,10] 
} 

gs_svr = (GridSearchCV(estimator=pipe_svr, 
         param_grid=grid_param_svr, 
         cv=2, 
         scoring = 'neg_mean_squared_error', 
         n_jobs = -1)) 

gs_svr = gs_svr.fit(X_train,y_train) 
gs_svr.best_estimator_  

Pipeline(steps=[('scl', StandardScaler(copy=True, with_mean=True, with_std=True)), 
('reg', MultiOutputRegressor(estimator=SVR(C=10, cache_size=200, 
coef0=0.0, degree=3, epsilon=0.1, gamma='auto', kernel='rbf', max_iter=-1,  
shrinking=True, tol=0.001, verbose=False), n_jobs=1))])
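
If the right prefix is not obvious, the valid parameter names can be listed with get_params() before building the grid. The sketch below does that and then runs a small search over two SVR parameters. The data (X_demo, y_demo) is synthetic and made up for illustration, and the grid values are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

pipe = Pipeline([('scl', StandardScaler()),
                 ('reg', MultiOutputRegressor(SVR()))])

# List every tunable name that reaches the inner SVR
svr_params = sorted(name for name in pipe.get_params()
                    if name.startswith('reg__estimator__'))
print(svr_params)

# Synthetic stand-in data (assumption): 60 samples, 3 features, 2 targets
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
y_demo = np.column_stack([
    X_demo[:, 0] + 0.1 * rng.normal(size=60),
    X_demo[:, 1] - X_demo[:, 2],
])

# Illustrative grid; each combination is applied to both per-target SVRs
grid = {
    'reg__estimator__C': [0.1, 1, 10],
    'reg__estimator__epsilon': [0.1, 0.2],
}
search = GridSearchCV(pipe, grid, cv=3, scoring='neg_mean_squared_error')
search.fit(X_demo, y_demo)
best = search.best_params_
print(best)
```

Note that this setup picks a single parameter combination shared by both outputs, scored on the error averaged over the two targets. If each response needs its own settings, the targets would have to be tuned separately.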