Ridge path in scikit-learn

I start with a pandas DataFrame, d_train (774 rows).

The idea is to follow the scikit-learn plot_ridge_coeffs example to investigate the ridge coefficient path.

In that example, the variables have these types:

X, y, w = make_regression(n_samples=10, n_features=10, coef=True, 
          random_state=1, bias=3.5) 
print X.shape, type(X), y.shape, type(y), w.shape, type(w) 

>> (10, 10) <type 'numpy.ndarray'> (10,) <type 'numpy.ndarray'> (10,) <type 'numpy.ndarray'>

To avoid the problem mentioned in this Stack Overflow discussion, I converted everything to numpy arrays:

predictors = ['p1', 'p2', 'p3', 'p4'] 
target = ['target_bins'] 
X = d_train[predictors].as_matrix() 
### X = np.transpose(d_train[predictors].as_matrix()) 
y = d_train['target_bins'].as_matrix() 
w = numpy.full((774,), 3, dtype=float) 
print X.shape, type(X), y.shape, type(y), w.shape, type(w) 
>> (774, 4) <type 'numpy.ndarray'> y_shape: (774,) <type 'numpy.ndarray'>  w_shape: (774,) <type 'numpy.ndarray'>

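Then I ran (a) the exact code from the example, and (b) the same code with the parameters fit_intercept = True, normalize = True added to the Ridge call (my data is not scaled). Both give the same error message: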

my_ridge = Ridge() 
coefs = [] 
errors = [] 
alphas = np.logspace(-6, 6, 200) 

for a in alphas: 
    my_ridge.set_params(alpha=a, fit_intercept = True, normalize = True) 
    my_ridge.fit(X, y) 
    coefs.append(my_ridge.coef_) 
    errors.append(mean_squared_error(my_ridge.coef_, w)) 
>> ValueError: Found input variables with inconsistent numbers of samples: [4, 774]

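As the commented-out line in the code shows, I also tried the "same" code with a transposed X matrix. I also tried scaling the data before creating the X matrix. Both attempts gave the same error message.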

Finally, I did the same thing using 'RidgeClassifier' and got a different error message:

>> Found input variables with inconsistent numbers of samples: [1, 774]

Question: I have no idea what is going on. Could you please help?

Canopy 1.7.4.3348 (64-bit) with Python 2.7, scikit-learn 18.01-3 and pandas 0.19.2-2

Thanks.

2017-02-03 user2738815

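You need as many weights in w as you have features (you learn a single weight per feature), but in your code the weight vector has dimension 774 (the number of rows in the training dataset), which is why it does not work. The error is raised by mean_squared_error(my_ridge.coef_, w), which compares the 4 fitted coefficients with the 774 entries of w. Modify the code as follows (so that w has 4 weights instead) and it will work: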

w = np.full((4,), 3, dtype=float) # number of features = 4, namely p1, p2, p3, p4 
print X.shape, type(X), y.shape, type(y), w.shape, type(w) 
#(774L, 4L) <type 'numpy.ndarray'> (774L,) <type 'numpy.ndarray'> (4L,) <type 'numpy.ndarray'>
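
The mismatch can be checked directly by comparing shapes. The following is a minimal sketch on synthetic data, not the original d_train: make_regression stands in for the real predictors, and the binary target used for RidgeClassifier is made up by thresholding y. It assumes Python 3 and a recent scikit-learn rather than the Python 2.7 setup in the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeClassifier
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for d_train: 774 rows, 4 predictors (assumption).
X, y = make_regression(n_samples=774, n_features=4, random_state=1)

# Ridge learns one coefficient per feature, so coef_ is 1-D with 4 entries.
ridge = Ridge(alpha=1.0).fit(X, y)
ridge_coef_shape = ridge.coef_.shape

# Hypothetical binary target, used only to inspect RidgeClassifier's coef_.
y_binary = (y > np.median(y)).astype(int)
clf = RidgeClassifier(alpha=1.0).fit(X, y_binary)
clf_coef_shape = clf.coef_.shape  # 2-D: a single row of 4 coefficients

# A weight vector with one entry per row cannot be compared with coef_.
w_bad = np.full((774,), 3, dtype=float)
raised = False
message = ""
try:
    mean_squared_error(ridge.coef_, w_bad)
except ValueError as exc:
    raised = True
    message = str(exc)

# With one entry per feature, the comparison is well defined.
w_ok = np.full((4,), 3, dtype=float)
mse = mean_squared_error(ridge.coef_, w_ok)

print(ridge_coef_shape, clf_coef_shape, raised)
```

In the binary case, the two-dimensional coef_ of RidgeClassifier has a leading dimension of 1, which would be consistent with the [1, 774] in the second error message.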

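Now you can run the rest of the code from http://scikit-learn.org/stable/auto_examples/linear_model/plot_ridge_coeffs.html#sphx-glr-auto-examples-linear-model-plot-ridge-coeffs-py to see how the weights and the error change with the regularization parameter alpha over the grid of values, and obtain the corresponding figure.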
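
For reference, the loop from the question with a correctly sized w might look like the sketch below. It assumes Python 3 and a recent scikit-learn, where, to my knowledge, the normalize argument of Ridge is no longer available, so it is dropped here. make_regression with coef=True supplies stand-in data together with the true weights w, as in the linked example; the sample and feature counts are chosen to mirror d_train. Plotting is omitted.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Stand-in data (assumption): 774 samples, 4 features, known true weights w.
X, y, w = make_regression(n_samples=774, n_features=4, coef=True,
                          random_state=1, bias=3.5)

alphas = np.logspace(-6, 6, 200)
coefs = []
errors = []

my_ridge = Ridge()
for a in alphas:
    my_ridge.set_params(alpha=a, fit_intercept=True)
    my_ridge.fit(X, y)
    coefs.append(my_ridge.coef_)
    # coef_ and w both have shape (4,), so the comparison is valid.
    errors.append(mean_squared_error(my_ridge.coef_, w))

coefs = np.array(coefs)    # shape (200, 4): one row of coefficients per alpha
errors = np.array(errors)  # shape (200,): coefficient error per alpha
```

If the data needs scaling, one option is to standardize X before fitting (for example with sklearn.preprocessing.StandardScaler) instead of relying on normalize.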

2017-02-22 18:52:20

Thanks, Mr. Dey, for correcting the slip. – user2738815
