PolynomialFeatures LinearRegression ValueError: shapes not aligned

I am trying to write a function that uses PolynomialFeatures to train and test a LinearRegression. Here is my code:

def get_lr2(pdeg): 
    from sklearn.linear_model import LinearRegression 
    from sklearn.preprocessing import PolynomialFeatures 
    from sklearn.metrics.regression import r2_score 
    from sklearn.model_selection import train_test_split 
    import numpy as np 
    import pandas as pd 

    np.random.seed(0) 
    n = 15 
    x = np.linspace(0,10,n) + np.random.randn(n)/5 
    y = np.sin(x)+x/6 + np.random.randn(n)/10 
    X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0) 
    test_data = np.linspace(0,10,100).reshape(100,1) 
    X_trainT  = X_train.reshape(-1,1) 
    y_trainT  = y_train.reshape(-1,1) 
    poly = PolynomialFeatures(degree=pdeg) 
    X_poly = poly.fit_transform(X_trainT) 
    X_train1, X_test1, y_train1, y_test1 = train_test_split(X_poly, y_trainT, random_state = 0) 
    linreg1 = LinearRegression().fit(X_train1, y_train1) 
    return linreg1.predict(test_data)

When I call the function (get_lr2(1)) I get

------------------------------------------------------------------------- 
    ValueError        Traceback (most recent call last) 
    ---> 84 get_lr2(1) 

    <ipython-input-29-a9966181155e> in get_lr2(pdeg) 
    23  X_train1, X_test1, y_train1, y_test1 = train_test_split(X_poly, y_trainT, random_state = 0) 
    24  linreg1 = LinearRegression().fit(X_train1, y_train1) 
    ---> 25  return linreg1.predict(test_data) 

    ValueError: shapes (100,1) and (2,1) not aligned: 1 (dim 1) != 2 (dim 0)

Can you help?

Source

2017-09-25 Nick

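The shapes (X, Y dimensions) of the test and training data do not seem to match. Can you check that 'test_data' and 'X_train1' have the same number of columns inside your function? – ShreyasG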
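To make that check concrete, here is a minimal sketch of the shapes involved. It assumes only NumPy and scikit-learn; the variable names mirror the question, and the evenly spaced inputs are a stand-in for the question's random data.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Stand-in for the question's training inputs: 11 samples, one feature.
X_trainT = np.linspace(0, 10, 11).reshape(-1, 1)
test_data = np.linspace(0, 10, 100).reshape(100, 1)

poly = PolynomialFeatures(degree=1)
X_poly = poly.fit_transform(X_trainT)

print(X_poly.shape)     # (11, 2): a bias column plus x
print(test_data.shape)  # (100, 1): never passed through poly

# Applying the same fitted transformer to test_data gives matching columns.
test_poly = poly.transform(test_data)
print(test_poly.shape)  # (100, 2)
```

The model is fitted on two columns but asked to predict from one, which is what the "shapes (100,1) and (2,1) not aligned" message reports.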

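Your code is a bit odd. Let's rework it in a few respects:

train_test_split.

You call train_test_split, then throw away your test set and create another one. That is odd. If you want the train/test split sizes to be in the proportion 15/100, just set that in the train_test_split options. The test size should then be 100/(100+15) ~= 0.87.

Preprocessing.

If you want to apply a preprocessing transformer (polynomial features) here, you can apply it to the whole dataset rather than to one of the splits. That would be incorrect if the transformer depended on the data (in that case you must call fit_transform on the training set and only transform on the test set), but in your case it does not matter.

Reshaping.

After these improvements you should only need to reshape in one place: when initializing x. Scikit-learn models expect your X data to be a matrix, or a column vector if there is only one feature. So reshape(-1,1) here turns your 1-D array (row vector) into a column vector.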
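As a small illustration of the reshaping point, the sketch below shows what reshape(-1,1) does to the shape, along with one broadcasting pitfall to watch for when the target is built from a column vector. The zero-valued `noise` array is a stand-in for np.random.randn(n) so the shapes are easy to follow.

```python
import numpy as np

n = 5
x = np.linspace(0, 10, n)   # 1-D array, shape (5,)
x_col = x.reshape(-1, 1)    # column vector, shape (5, 1)

noise = np.zeros(n)         # 1-D stand-in for np.random.randn(n)

# Pitfall: a (5, 1) column plus a (5,) array broadcasts to (5, 5).
y_bad = np.sin(x_col) + noise
# Keeping both operands 1-D gives one target value per sample.
y_ok = np.sin(x) + noise

print(x_col.shape, y_bad.shape, y_ok.shape)  # (5, 1) (5, 5) (5,)
```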

So the code looks like this:

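import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

def get_lr2(pdeg):
    np.random.seed(0)
    n = 115
    # Reshape once so x is a column vector of shape (n, 1).
    x = (np.linspace(0, 10, n) + np.random.randn(n) / 5).reshape(-1, 1)
    # Flatten before adding the 1-D noise; (n, 1) + (n,) would broadcast to (n, n).
    y = (np.sin(x) + x / 6).ravel() + np.random.randn(n) / 10

    X_poly = PolynomialFeatures(degree=pdeg).fit_transform(x)

    # test_size=0.87 gives roughly the 15/100 train/test proportion.
    X_train, X_test, y_train, y_test = train_test_split(X_poly, y, random_state=0, test_size=0.87)

    linreg1 = LinearRegression().fit(X_train, y_train)
    return linreg1.predict(X_test)

get_lr2(2)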
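If the goal is still to predict on the 100-point grid from the question, one option is to wrap the transformer and the regressor in a pipeline, so the same polynomial expansion is applied at fit and predict time. This is only a sketch under that assumption; `predict_on_grid` is a hypothetical name, not something from the question or from scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def predict_on_grid(pdeg):
    """Hypothetical helper: fit on the training split, predict on a grid."""
    np.random.seed(0)
    n = 15
    x = np.linspace(0, 10, n) + np.random.randn(n) / 5
    y = np.sin(x) + x / 6 + np.random.randn(n) / 10
    X_train, X_test, y_train, y_test = train_test_split(
        x.reshape(-1, 1), y, random_state=0
    )

    # The pipeline fits PolynomialFeatures on the training data only and
    # reapplies the same transform inside predict().
    model = make_pipeline(PolynomialFeatures(degree=pdeg), LinearRegression())
    model.fit(X_train, y_train)

    grid = np.linspace(0, 10, 100).reshape(-1, 1)
    return model.predict(grid)


preds = predict_on_grid(1)
print(preds.shape)  # (100,)
```

Because the grid goes through the pipeline, it never reaches LinearRegression with the wrong number of columns, whatever degree is chosen.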

Source

2017-09-25 15:01:59 Grigoriy
