2016-03-21

Linear regression returns results that differ from the synthetic parameters

I tried this code:

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10

y = 2 * x1 + 3 * x2
X = np.vstack((x1, x2)).transpose()

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)

print(reg_model.coef_)
# should be [2,3]

print(reg_model.predict([[5, 6]]))  # predict expects a 2-D array: one row per sample
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))
# seems to be rather confident to be right

The result is:

  • [0.31683168 3.16831683]
  • 20.5940594059
  • 0.0
  • 1.0

So this is not what I expected: the fitted coefficients differ from the parameters used to synthesize the data. Why is that?

Answer

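Your problem is that the solution is not unique. Both dimensions carry the same information: x2 is just x1 multiplied by 10, and applying a linear transformation to a dimension does not produce new data in the eyes of this model. Since y = 2*x1 + 3*x2 = 32*x1, any coefficients (a, b) with a + 10*b = 32 fit your data perfectly, so there are infinitely many possible solutions. The coefficients you got are one of them: 0.31683168 + 10 * 3.16831683 is approximately 32. Apply a nonlinear transformation to the second dimension instead and you will see the desired output.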

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 ** 2  # nonlinear in x1, so the two columns are no longer collinear
X = np.vstack((x1, x2)).transpose()
y = 2 * x1 + 3 * x2

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)
print(reg_model.coef_)
# should be [2,3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))

The output is:

  • [ 2. 3.]
  • [ 28.]
  • -2.84217094304e-14
  • 1.0
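
The non-uniqueness can also be checked directly. The following is a minimal sketch using only NumPy rather than scikit-learn; the names `X_collinear`, `X_quadratic`, `true_coef`, `returned_coef` and `min_norm_coef` are illustrative and not part of the original code. It leaves out the intercept, which is an assumption that happens to be harmless here because the synthetic data has an intercept of exactly 0. It compares the rank of the two design matrices, confirms that both [2, 3] and the coefficients from the question reproduce y on the collinear data, and computes the minimum-norm least-squares solution, which is consistent with what the first fit returned.

```python
import numpy as np

x1 = np.arange(0, 10, 0.1)

# Design matrix from the question: second column is a multiple of the first.
X_collinear = np.column_stack((x1, x1 * 10))
# Design matrix from the answer: second column is a nonlinear function of the first.
X_quadratic = np.column_stack((x1, x1 ** 2))

# A rank below the number of columns means the coefficients are not uniquely determined.
rank_collinear = np.linalg.matrix_rank(X_collinear)
rank_quadratic = np.linalg.matrix_rank(X_quadratic)
print(rank_collinear, rank_quadratic)

# Target built with the "true" parameters [2, 3] on the collinear data.
y = X_collinear @ np.array([2.0, 3.0])

true_coef = np.array([2.0, 3.0])
# Any (a, b) with a + 10*b = 32 fits; this is the pair with the smallest norm.
returned_coef = np.array([32.0 / 101.0, 320.0 / 101.0])

# Both coefficient vectors reproduce y on the collinear data.
fits_true = np.allclose(X_collinear @ true_coef, y)
fits_returned = np.allclose(X_collinear @ returned_coef, y)
print(fits_true, fits_returned)

# For a rank-deficient problem, lstsq returns the minimum-norm solution.
min_norm_coef, *_ = np.linalg.lstsq(X_collinear, y, rcond=None)
print(min_norm_coef)
```

If the coefficients themselves matter, and not just the predictions, checking the rank of X before fitting is a cheap way to spot this situation.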