Multivariate gradient descent

I am learning gradient descent for computing coefficients. Here is what I am doing:

#!/usr/bin/python

import numpy as np


# m denotes the number of examples here, not the number of features
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        # avg cost per example (the 2 in 2*m doesn't really matter here.
        # But to be consistent with the gradient, I include it)
        cost = np.sum(loss ** 2) / (2 * m)
        # print("Iteration %d | Cost: %f" % (i, cost))
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        # update
        theta = theta - alpha * gradient
    return theta


X = np.array([41.9,43.4,43.9,44.5,47.3,47.5,47.9,50.2,52.8,53.2,56.7,57.0,63.5,65.3,71.1,77.0,77.8])
y = np.array([251.3,251.3,248.3,267.5,273.0,276.5,270.3,274.9,285.0,290.0,297.0,302.5,304.5,309.3,321.7,330.7,349.0])
n = np.max(X.shape)
x = np.vstack([np.ones(n), X]).T
m, n = np.shape(x)
numIterations = 100000
alpha = 0.0005
theta = np.ones(n)
theta = gradientDescent(x, y, theta, alpha, m, numIterations)
print(theta)

The code above works fine. If I now try multiple variables, replacing X with X1 as follows:

X1 = np.array([[41.9,43.4,43.9,44.5,47.3,47.5,47.9,50.2,52.8,53.2,56.7,57.0,63.5,65.3,71.1,77.0,77.8], [29.1,29.3,29.5,29.7,29.9,30.3,30.5,30.7,30.8,30.9,31.5,31.7,31.9,32.0,32.1,32.5,32.9]]) 

then my code fails and shows the following error:

JustTestingSGD.py:14: RuntimeWarning: overflow encountered in square
  cost = np.sum(loss ** 2)/(2 * m)
JustTestingSGD.py:19: RuntimeWarning: invalid value encountered in subtract
  theta = theta - alpha * gradient
[ nan nan nan]

Can anyone tell me how I can do gradient descent using X1? My expected output using X1 is:

[-153.5 1.24 12.08] 

I am open to other Python implementations as well. I just want the coefficients (also called thetas) for X1 and y.

Answer

The problem is that your algorithm does not converge. It diverges instead. The first error:

JustTestingSGD.py:14: RuntimeWarning: overflow encountered in square 
cost = np.sum(loss ** 2)/(2 * m) 

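arises because, at some point, the code tries to square a value whose result is too large for a 64-bit float to hold (the largest finite 64-bit float is about 1.8 × 10^308). The second warning: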

JustTestingSGD.py:19: RuntimeWarning: invalid value encountered in subtract 
theta = theta - alpha * gradient 

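is just a consequence of the previous error. Once the values have overflowed to infinity, they are no longer usable for arithmetic, and an operation such as inf - inf produces nan.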
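The two warnings can be reproduced in isolation. The snippet below is only a small sketch: the value `1e200` and the variable names are made up for illustration and do not come from the asker's run.

```python
import numpy as np

# Largest finite value a 64-bit float can represent.
largest = np.finfo(np.float64).max
print(largest)

# Squaring a value whose square exceeds that limit overflows to inf.
# 1e200 is an arbitrary illustrative value, not one taken from the real run.
with np.errstate(over="ignore"):
    squared = np.square(np.array([1e200]))
print(squared)

# Once inf appears on both sides of a subtraction, the result is nan.
alpha = 0.0005
with np.errstate(invalid="ignore"):
    theta = np.array([np.inf]) - alpha * np.array([np.inf])
print(theta)
```

`np.errstate` is used here only to silence the warnings so that the printed values are easy to read.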

By uncommenting the debug print line you can actually see the divergence: the cost keeps growing instead of shrinking.
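As a rough sketch of what that debug output looks like, the helper below (`cost_trace`, a name invented here) runs the same update rule as the question's `gradientDescent` and records the cost at each step. The iteration count of 200 is an arbitrary choice to keep the run short.

```python
import numpy as np


def cost_trace(x, y, alpha, num_iterations):
    """Run batch gradient descent and record the cost at every iteration."""
    m, n = x.shape
    theta = np.ones(n)
    costs = []
    for _ in range(num_iterations):
        loss = x @ theta - y
        costs.append(np.sum(loss ** 2) / (2 * m))
        theta = theta - alpha * (x.T @ loss) / m
    return costs


X1 = np.array([[41.9, 43.4, 43.9, 44.5, 47.3, 47.5, 47.9, 50.2, 52.8, 53.2, 56.7, 57.0, 63.5, 65.3, 71.1, 77.0, 77.8],
               [29.1, 29.3, 29.5, 29.7, 29.9, 30.3, 30.5, 30.7, 30.8, 30.9, 31.5, 31.7, 31.9, 32.0, 32.1, 32.5, 32.9]])
y = np.array([251.3, 251.3, 248.3, 267.5, 273.0, 276.5, 270.3, 274.9, 285.0, 290.0, 297.0, 302.5, 304.5, 309.3, 321.7, 330.7, 349.0])

# Design matrix with a leading column of ones, as in the question.
x = np.column_stack([np.ones(X1.shape[1]), X1.T])

# Same alpha as in the question; only a short run is needed to see the trend.
costs = cost_trace(x, y, alpha=0.0005, num_iterations=200)
for i in (0, 50, 100, 199):
    print("Iteration %d | Cost: %e" % (i, costs[i]))
```

With `alpha=0.0005` the printed cost should grow by many orders of magnitude over these iterations, which is the divergence described above.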

If you run your function on X1 with a smaller alpha value, it converges.
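How small is small enough can be estimated. For a quadratic cost like this one, gradient descent with a fixed step stays stable only if alpha is below 2 divided by the largest eigenvalue of `x.T @ x / m`. The sketch below computes that limit for the single-feature and two-feature design matrices; the helper name `alpha_limit` is an assumption made for illustration.

```python
import numpy as np

X1 = np.array([[41.9, 43.4, 43.9, 44.5, 47.3, 47.5, 47.9, 50.2, 52.8, 53.2, 56.7, 57.0, 63.5, 65.3, 71.1, 77.0, 77.8],
               [29.1, 29.3, 29.5, 29.7, 29.9, 30.3, 30.5, 30.7, 30.8, 30.9, 31.5, 31.7, 31.9, 32.0, 32.1, 32.5, 32.9]])


def alpha_limit(x):
    """Upper bound on a stable fixed step size for the cost sum(loss**2)/(2*m)."""
    m = x.shape[0]
    hessian = x.T @ x / m          # constant Hessian of the quadratic cost
    return 2.0 / np.linalg.eigvalsh(hessian).max()


ones = np.ones(X1.shape[1])
x_single = np.column_stack([ones, X1[0]])   # original problem: first feature only
x_multi = np.column_stack([ones, X1.T])     # both features

limit_single = alpha_limit(x_single)
limit_multi = alpha_limit(x_multi)
print("single feature: alpha must stay below", limit_single)
print("two features:   alpha must stay below", limit_multi)
```

On this data the limit for the two-feature problem should come out just below 0.0005, while the single-feature limit sits above it, which is consistent with the same alpha working for X but not for X1.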

If I compute 'X1' with 'alpha = 0.0001', it does converge and I get the following result: '[0.92429681 1.80242842 6.07549978]', but I was expecting something like '[-153.5 1.24 12.08]'. How can I get the desired result? – user227666
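One possible way to approach the question in this comment, offered as a sketch and not as part of the original answer: with features on such different scales, gradient descent tends to crawl along some directions, so 100000 iterations at a small alpha may simply not be enough. Standardizing the features before running gradient descent usually helps. The step size `0.5` and the 5000 iterations below are illustrative choices, and `np.linalg.lstsq` is used only as a cross-check.

```python
import numpy as np

X1 = np.array([[41.9, 43.4, 43.9, 44.5, 47.3, 47.5, 47.9, 50.2, 52.8, 53.2, 56.7, 57.0, 63.5, 65.3, 71.1, 77.0, 77.8],
               [29.1, 29.3, 29.5, 29.7, 29.9, 30.3, 30.5, 30.7, 30.8, 30.9, 31.5, 31.7, 31.9, 32.0, 32.1, 32.5, 32.9]])
y = np.array([251.3, 251.3, 248.3, 267.5, 273.0, 276.5, 270.3, 274.9, 285.0, 290.0, 297.0, 302.5, 304.5, 309.3, 321.7, 330.7, 349.0])

features = X1.T                      # shape (17, 2): one row per example
mu = features.mean(axis=0)
sigma = features.std(axis=0)
z = (features - mu) / sigma          # standardized features (mean 0, std 1)

m = z.shape[0]
x = np.column_stack([np.ones(m), z])

# Plain batch gradient descent on the standardized problem.
# alpha and the iteration count are illustrative, not tuned values.
alpha = 0.5
w = np.ones(x.shape[1])
for _ in range(5000):
    w = w - alpha * (x.T @ (x @ w - y)) / m

# Map the coefficients back to the original feature scale.
slopes = w[1:] / sigma
intercept = w[0] - np.sum(slopes * mu)
theta = np.concatenate([[intercept], slopes])

# Closed-form least squares on the raw features, as a cross-check.
x_raw = np.column_stack([np.ones(m), features])
theta_lstsq, *_ = np.linalg.lstsq(x_raw, y, rcond=None)

print(theta)
print(theta_lstsq)
```

The two printed vectors should agree with each other and can be compared against the expected '[-153.5 1.24 12.08]'. If only the coefficients are needed, the `np.linalg.lstsq` call alone is enough.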