Implementing naive gradient descent in Python

I am trying to implement a very naive gradient descent in Python. However, it looks like it goes into an infinite loop. Could you help me debug it?

y = lambda x: x**2
dy_dx = lambda x: 2*x
def gradient_descent(function, derivative, initial_guess):
    optimum = initial_guess
    while derivative(optimum) != 0:
        optimum = optimum - derivative(optimum)
    else:
        return optimum
gradient_descent(y, dy_dx, 5)

Edit:

Now I have the following code, and I really cannot make sense of its output. P.S. It may freeze your CPU.

y = lambda x: x**2
dy_dx = lambda x: 2*x
def gradient_descent(function, derivative, initial_guess):
    optimum = initial_guess
    while abs(derivative(optimum)) > 0.01:
        optimum = optimum - 2*derivative(optimum)
        print((optimum, derivative(optimum)))
    else:
        return optimum
gradient_descent(y, dy_dx, 5)

Now I want to apply it to a regression problem, but the output does not look correct, as shown in the figure below:

[Figure: output of the gradient descent code below]

import matplotlib.pyplot as plt
def stepGradient(x, y, step):
    b_current = 0
    m_current = 0
    b_gradient = 0
    m_gradient = 0
    N = int(len(x))
    for i in range(0, N):
        b_gradient += -(1/N) * (y[i] - ((m_current * x[i]) + b_current))
        m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    while abs(b_gradient) > 0.01 and abs(m_gradient) > 0.01:
        b_current = b_current - (step * b_gradient)
        m_current = m_current - (step * m_gradient)
        for i in range(0, N):
            b_gradient += -(1/N) * (y[i] - ((m_current * x[i]) + b_current))
            m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    return [b_current, m_current]

x = [1,2, 2,3,4,5,7,8] 
y = [1.5,3,1,3,2,5,6,7] 
step = 0.00001 
(b,m) = stepGradient(x,y,step) 


plt.scatter(x,y) 
abline_values = [m * i + b for i in x] 
plt.plot(x, abline_values, 'b') 
plt.show()

Fixed :D

import matplotlib.pyplot as plt
def stepGradient(x, y):
    step = 0.001
    b_current = 0
    m_current = 0
    b_gradient = 0
    m_gradient = 0
    N = int(len(x))
    for i in range(0, N):
        b_gradient += -(1/N) * (y[i] - ((m_current * x[i]) + b_current))
        m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    while abs(b_gradient) > 0.01 or abs(m_gradient) > 0.01:
        b_current = b_current - (step * b_gradient)
        m_current = m_current - (step * m_gradient)
        # Reset the gradients so each pass measures only the current line.
        b_gradient = 0
        m_gradient = 0
        for i in range(0, N):
            b_gradient += -(1/N) * (y[i] - ((m_current * x[i]) + b_current))
            m_gradient += -(1/N) * x[i] * (y[i] - ((m_current * x[i]) + b_current))
    return [b_current, m_current]

x = [1,2, 2,3,4,5,7,8,10] 
y = [1.5,3,1,3,2,5,6,7,20] 
(b,m) = stepGradient(x,y) 


plt.scatter(x,y) 
abline_values = [m * i + b for i in x] 
plt.plot(x, abline_values, 'b') 
plt.show()
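One way to check whether a fitted line "looks right" is to compare the gradient descent result with the closed-form least-squares solution for a single feature. The sketch below is only an illustration: the function names `fit_line_gd` and `fit_line_closed_form`, and the `step`, `tol`, and `max_iter` defaults, are assumptions chosen for this example and are not part of the code above. The gradients are the same ones used in `stepGradient`.

```python
def fit_line_gd(x, y, step=0.01, tol=1e-6, max_iter=100_000):
    """Fit y ~ m*x + b by gradient descent on half the mean squared error.

    step, tol and max_iter are illustrative defaults, not tuned values.
    """
    n = len(x)
    b = m = 0.0
    for _ in range(max_iter):
        residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
        b_gradient = -sum(residuals) / n
        m_gradient = -sum(xi * r for xi, r in zip(x, residuals)) / n
        # Stop only when BOTH gradients are small.
        if abs(b_gradient) <= tol and abs(m_gradient) <= tol:
            break
        b -= step * b_gradient
        m -= step * m_gradient
    return b, m


def fit_line_closed_form(x, y):
    """Ordinary least squares for one feature: m = Sxy / Sxx, b = mean(y) - m * mean(x)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    m = sxy / sxx
    return y_mean - m * x_mean, m


x = [1, 2, 2, 3, 4, 5, 7, 8, 10]
y = [1.5, 3, 1, 3, 2, 5, 6, 7, 20]
print(fit_line_gd(x, y))           # (b, m) from gradient descent
print(fit_line_closed_form(x, y))  # (b, m) from the normal equations
```

If the two pairs agree to a few decimal places, the descent loop is probably doing what it should; a large gap usually points to a step size or stopping-condition problem.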


2016-12-17 Ahmad Abdelzaher

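The thing with gradient descent is that it rarely reaches a derivative of exactly 0. The process works well while the gradient is large, but once the changes become small it tends to hover around the optimum. Try putting a limit on the while loop, or loop only while the derivative is larger in magnitude than a small epsilon (such as 0.0001).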

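What do you mean by "the output does not look correct"? Show the expected output and the output you actually get (console output, traceback, plot, etc.). The more detail you provide, the better the answers you are likely to receive. Check the [FAQ](http://stackoverflow.com/tour) and [How to Ask](http://stackoverflow.com/help/how-to-ask).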

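Your while loop stops only when a calculated floating-point value is exactly zero. This is naive, since floating-point values are rarely calculated exactly. Instead, stop the loop when the calculated value is close enough to zero. Use something like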

while abs(derivative(optimum)) > eps:

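where eps is the desired precision of the calculated value. It could be made another parameter, perhaps with a default value of 1e-10 or some such.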
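A minimal sketch of that idea, keeping the question's original update rule unchanged. The `eps` default follows the suggestion above; the `max_iter` cap and the returned `converged` flag are assumptions added here so that the loop always terminates and the caller can tell why it stopped.

```python
def gradient_descent(derivative, initial_guess, eps=1e-10, max_iter=1000):
    """Repeat x <- x - f'(x) until |f'(x)| <= eps or max_iter steps have run.

    Returns (x, converged). max_iter and the flag are illustrative additions.
    """
    optimum = initial_guess
    for _ in range(max_iter):
        if abs(derivative(optimum)) <= eps:
            return optimum, True
        optimum = optimum - derivative(optimum)
    # The cap was reached; report whether the last point happens to qualify.
    return optimum, abs(derivative(optimum)) <= eps


# f(x) = x**2 / 2 has derivative x, so one step from 5 lands exactly on 0.
print(gradient_descent(lambda x: x, 5))
# f(x) = x**2 has derivative 2*x; the tolerance alone does not help here,
# and only the iteration cap ends the loop.
print(gradient_descent(lambda x: 2 * x, 5))
```

The second call shows that a tolerance fixes the exact-zero comparison but not an update that never approaches the optimum.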

That said, the problem in your case is worse. Your algorithm is overly naive in assuming that the calculation

optimum = optimum - 2*derivative(optimum)

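will move the value of optimum closer to the actual optimum. With the update from your first version, optimum - derivative(optimum), the variable optimum just cycles back and forth between 5 (your initial guess) and -5: the derivative at 5 is 10, and the derivative at -5 is -10. With the factor of 2 shown above, each step maps optimum to -3*optimum, so the values grow without bound instead.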

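So you need to avoid this kind of cycling. You could multiply your increment 2*derivative(optimum) by a positive factor smaller than 1, which can work in your particular case y=x**2 (here the factor has to be below 0.5). But this does not work in general.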
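The effect of the factor in front of the derivative can be traced by hand. For y = x**2 the update x - step*2*x equals (1 - 2*step)*x, so the iterates shrink only when 0 < step < 1. The helper below, with the hypothetical name `iterates`, is just an illustration that lists the first few points for a given factor.

```python
def iterates(step, initial_guess=5, count=4):
    """Return the first `count` points of x <- x - step * f'(x) for f(x) = x**2."""
    points = [initial_guess]
    for _ in range(count - 1):
        x = points[-1]
        points.append(x - step * 2 * x)
    return points


print(iterates(1))     # [5, -5, 5, -5]       cycles forever
print(iterates(2))     # [5, -15, 45, -135]   diverges
print(iterates(0.25))  # [5, 2.5, 1.25, 0.625] halves each step
```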

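To be completely safe, "bracket" your optimum point with a smaller value and a larger value, and use the derivative to find the next guess. But make sure that your next guess does not fall outside the bracketing interval. If it does, or if the guesses converge too slowly, use another method such as bisection or golden-section search.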
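One possible reading of the bracketing advice, as a rough sketch rather than a production routine. It assumes a continuous derivative that is negative at `lo` and positive at `hi`, so a minimum lies between them. The name `bracketed_minimize` and all default values are assumptions for this illustration, and it implements only the "guess left the bracket" fallback to bisection, not the "convergence too slow" check or golden-section search.

```python
def bracketed_minimize(derivative, lo, hi, step=0.1, eps=1e-10, max_iter=200):
    """Look for x in [lo, hi] with |f'(x)| <= eps, assuming f'(lo) < 0 < f'(hi)."""
    if not (derivative(lo) < 0 < derivative(hi)):
        raise ValueError("bracket must satisfy f'(lo) < 0 < f'(hi)")
    x = (lo + hi) / 2
    for _ in range(max_iter):
        slope = derivative(x)
        if abs(slope) <= eps:
            break
        # The sign of the slope tells which side of x the minimum is on,
        # so x can replace one end of the bracket.
        if slope > 0:
            hi = x
        else:
            lo = x
        candidate = x - step * slope
        # Accept the gradient step only if it stays strictly inside the
        # bracket; otherwise fall back to the midpoint (bisection).
        x = candidate if lo < candidate < hi else (lo + hi) / 2
    return x


dy_dx = lambda x: 2 * x
print(bracketed_minimize(dy_dx, -3, 5))          # small step: plain descent
print(bracketed_minimize(dy_dx, -3, 5, step=2))  # oversized step: bisection rescues it
```

With the oversized step, every gradient guess lands outside the bracket, so the routine bisects instead of diverging.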

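Of course, this means that your "very naive gradient descent" algorithm is too naive to work in general. That is why real optimization routines are more complicated.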


2016-12-17 20:43:52

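Thanks. I just tried that, but the loop still keeps going.

Question updated.

Sorry, I thought people would just run the code. I will update it with a plot shortly.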

You also need to decrease the step size (gamma in the gradient descent formula):

y = lambda x: x**2
dy_dx = lambda x: 2*x
def gradient_descent(function, derivative, initial_guess):
    optimum = initial_guess
    while abs(derivative(optimum)) > 0.01:
        optimum = optimum - 0.01*derivative(optimum)
        print((optimum, derivative(optimum)))
    else:
        return optimum
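The function does return the final value once the loop ends; the value is simply discarded unless the call's result is assigned or printed. A small sketch, assuming the step size and tolerance are exposed as parameters named `step` and `eps` (names chosen here, not taken from the answer):

```python
def gradient_descent(derivative, initial_guess, step=0.01, eps=0.01):
    """Run x <- x - step * f'(x) until |f'(x)| <= eps, then return x.

    step must be small enough for the iterates to shrink; for f(x) = x**2
    that means 0 < step < 1.
    """
    optimum = initial_guess
    while abs(derivative(optimum)) > eps:
        optimum = optimum - step * derivative(optimum)
    return optimum


dy_dx = lambda x: 2 * x
result = gradient_descent(dy_dx, 5)  # keep the returned value
print(result)                        # a small positive number close to 0
```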


2016-12-17 21:26:38 rofls

Thanks, the algorithm works, but the return does not seem to. How do I make the function return the final optimum value?

Question updated.
