Python PolynomialFeatures transforms data into a different shape from the original data

I am using sklearn's PolynomialFeatures to preprocess my data into polynomial transformations of various degrees so that I can compare their model fits. Here is my code:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split

np.random.seed(0)

# x and y are the original data
n = 100
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + n / 6 + np.random.randn(n) / 10

# use PolynomialFeatures and fit_transform to transform the original data to degree 2
poly1 = PolynomialFeatures(degree=2)
x_D2_poly = poly1.fit_transform(x)

# check their dimensions
x.shape
x_D2_poly.shape

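However, the transformation above returned an array of shape (1, 5151) from the original x, which I took to be (100, 1). This is not what I expected, and I cannot figure out what is wrong with my code. It would be great if someone could point out the error in my code or in my understanding. Should I use a different method to transform the original data?

Thanks.

Regards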

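[Update] After reshaping the original x with X = x.reshape(-1, 1), Python does give me the desired output dimension (100, 1) through poly1.fit_transform(X). However, when I ran train_test_split, fitted the data, and tried to obtain predicted values: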

x_poly1_train, x_poly1_test, y_train, y_test = train_test_split(x_poly1, y, random_state=0)
linreg = LinearRegression().fit(x_poly1_train, y_train)
poly_predict = LinearRegression().predict(x)

Python returned this error message:

shapes (1,100) and (2,) not aligned: 100 (dim 1) != 2 (dim 0)

Clearly I got the dimensions wrong somewhere. Can anyone shed some light on this?

Thanks.

Source

2017-06-10 Chris T.

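I have answered your edited question, but since I did answer your first question, would you mind accepting the answer?

Thanks again. Sorry I could not upvote your answer sooner; I was a bit preoccupied with understanding the reshape() thing... :p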

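I think you need to reshape your x:

x = x.reshape(-1, 1)

Your x has shape (100,), not (100, 1), and fit_transform expects 2 dimensions. It therefore treated x as a single sample with 100 features. You get 5151 features because a degree-2 expansion produces one feature for each distinct pair of inputs (100 * 99 / 2 = 4950), one for each input squared (100), one for each input to the first power (100), and one for the 0th power (1): 4950 + 100 + 100 + 1 = 5151.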
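As a minimal sketch of that count, the snippet below reshapes x both ways and checks the resulting widths. It passes x.reshape(1, -1) explicitly to reproduce the "one sample, 100 features" reading, since recent scikit-learn versions, as far as I know, reject a 1-D array with an error instead of silently treating it as one row. The variable names as_row and as_column are only illustrative.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
n = 100
x = np.linspace(0, 10, n) + np.random.randn(n) / 5  # shape (100,)

poly = PolynomialFeatures(degree=2)

# One sample with 100 features: the reading that yields 5151 columns.
as_row = poly.fit_transform(x.reshape(1, -1))
print(as_row.shape)  # (1, 5151)

# 100 samples with 1 feature each: the columns are 1, x, x**2.
as_column = poly.fit_transform(x.reshape(-1, 1))
print(as_column.shape)  # (100, 3)

# The count from the answer: pairs + squares + first powers + constant,
# which equals the binomial coefficient C(n + 2, 2).
feature_count = n * (n - 1) // 2 + n + n + 1
print(feature_count, comb(n + 2, 2))  # 5151 5151
```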

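In response to the edited question: you need to call transform on the data you want to predict on, so that it goes through the same polynomial expansion as the training data.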
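A sketch of the full flow, under some assumptions: the question does not show how x_poly1 was built, so degree=1 is a guess based on the (2,) coefficient shape in the error message. Note also that the snippet in the question calls predict on a fresh LinearRegression() rather than on the fitted linreg; the version below uses the fitted model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
n = 100
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + n / 6 + np.random.randn(n) / 10

X = x.reshape(-1, 1)                   # (100, 1): 100 samples, 1 feature
poly1 = PolynomialFeatures(degree=1)   # assumed degree, see the note above
x_poly1 = poly1.fit_transform(X)       # (100, 2): columns are 1 and x

x_poly1_train, x_poly1_test, y_train, y_test = train_test_split(
    x_poly1, y, random_state=0
)
linreg = LinearRegression().fit(x_poly1_train, y_train)

# Predict with the fitted model, on data passed through the same transformer.
poly_predict = linreg.predict(poly1.transform(X))
print(poly_predict.shape)  # (100,)
```

The held-out rows in x_poly1_test are already transformed, so linreg.predict(x_poly1_test) works without a further transform call.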

Source

2017-06-10 17:59:54

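Thanks! I checked the documentation, which says the new shape should be compatible with the original shape. So when we pass reshape(-1, 1), we are asking NumPy to infer that dimension from "the length of the array and remaining dimensions" so that the result matches the original data. Am I thinking about this correctly?

Using -1 means "work out this dimension from the other dimensions provided." In this case, we are saying we want 1 column and 100/1 rows.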
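A small illustration of how -1 is resolved, using a stand-in array of 100 integers rather than the data from the question:

```python
import numpy as np

x = np.arange(100)  # shape (100,)

column = x.reshape(-1, 1)   # 1 column, so 100 / 1 = 100 rows
row = x.reshape(1, -1)      # 1 row, so 100 / 1 = 100 columns
blocks = x.reshape(-1, 4)   # 4 columns, so 100 / 4 = 25 rows

print(column.shape)  # (100, 1)
print(row.shape)     # (1, 100)
print(blocks.shape)  # (25, 4)

# If the length is not divisible by the given dimensions, NumPy raises
# ValueError instead of guessing.
try:
    x.reshape(-1, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```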
