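Stochastic gradient descent implementation - MATLAB

I am trying to implement stochastic gradient descent in MATLAB. I followed the algorithm exactly, but I get very, very large values of w (the coefficients) for the prediction/fitting function. Is there a mistake in the algorithm?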

Algorithm: [image not included]

    x = 0:0.1:2*pi;  % X-axis
    n = size(x,2);
    r = -0.2+(0.4).*rand(n,1); % generating random noise to be added to the sin(x) function

    t=zeros(1,n);
    y=zeros(1,n);

    for i=1:n
     t(i)=sin(x(i))+r(i);   % adding the noise
     y(i)=sin(x(i));    % the function without noise
    end

    f = randi(n,20,1);  % generating random indexes in 1..n (round(1+rand(20,1)*n) could yield n+1)

    h = x(f);       % choosing random x points
    k = t(f);       % choosing random y points

    m=size(h,2);      % length of the h vector

    scatter(h,k,'Red');    % drawing the training points (with noise)
    %scatter(x,t,2);
    hold on;
    plot(x,sin(x));     % plotting the sin function

    w = [0.3 1 0.5];     % starting point of w
    a=0.05;       % learning rate "alpha"

    % ---------------- ALGORITHM ---------------------%
    for i=1:20
     v = [1 h(i) h(i).^2];      % X vector
     e = ((w*v') - k(i)).*v;   % (prediction - observation) times the X vector
     w = w - a*e;      % updating w
    end

    hold on;

    l = 0:1:6;
    g = w(1)+w(2)*l+w(3)*(l.^2);
    plot(l,g,'Yellow');      % drawing the prediction function


2011-02-25 Morano88

If the learning rate you use is too high, SGD is likely to diverge.
The learning rate should decay toward zero.
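
As a rough illustration of this point, here is a minimal sketch of the same update with a step size that shrinks over time. The values `a0`, `tau` and `nSteps`, and the schedule `a0/(1 + s/tau)`, are assumptions chosen for this example and are not taken from the question. The reasoning behind `a0`: each update scales the error along `v` by `1 - a*(v*v')`, which stays bounded only when `a*(v*v') < 2`. With x up to 6.2, `v*v'` is about 1517, so `a = 0.05` overshoots by a wide margin, while `a0 = 1e-3` stays under the limit.

```matlab
% Sketch: SGD on the model w(1) + w(2)*x + w(3)*x^2 with a decaying step size.
x = 0:0.1:2*pi;
n = numel(x);
t = sin(x) + (-0.2 + 0.4*rand(1, n));   % noisy targets, as in the question

w      = [0.3 1 0.5];   % same starting point as in the question
a0     = 1e-3;          % assumed initial learning rate (a0 * max(v*v') < 2)
tau    = 1000;          % assumed decay time constant
nSteps = 5000;          % assumed number of updates

for s = 1:nSteps
    i = randi(n);                       % pick one training sample at random
    v = [1 x(i) x(i)^2];                % feature vector for that sample
    a = a0 / (1 + s/tau);               % step size decays toward zero
    w = w - a * ((w*v') - t(i)) * v;    % same update rule as in the question
end
disp(w);
```

With these settings the coefficients should stay small rather than blow up. Rescaling x (for example to the range [0, 1]) before building `v` would likely allow a larger step size, but that is a separate design choice.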


2011-02-26 16:21:29

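Usually, when w ends up with values that are too large, there is overfitting. I have not looked at your code very carefully, but I think what it is missing is a proper regularization term, which would prevent overtraining. Also, here: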

e = ((w*v') - k(i)).*v;

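The v here is not the gradient of the predicted value, is it? According to the algorithm, you should replace it. Let's see what it looks like after doing that.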
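
One way the regularization suggestion could look is sketched below, assuming an L2 (ridge) penalty `0.5*lambda*(w*w')` added to the per-sample squared error. The values of `lambda`, `a` and the number of updates are assumptions for illustration only. Note that for this model the prediction `w*v'` is linear in `w`, so its gradient with respect to `w` is `v` itself; the sketch therefore keeps the data term of the update unchanged and only adds the penalty gradient `lambda*w`. It also uses a smaller step size than the question, since a penalty by itself would probably not stabilize `a = 0.05`.

```matlab
% Sketch: the question's update with an added L2 (ridge) penalty.
x = 0:0.1:2*pi;
n = numel(x);
t = sin(x) + (-0.2 + 0.4*rand(1, n));   % noisy targets, as in the question

w      = [0.3 1 0.5];   % starting point from the question
a      = 1e-3;          % assumed learning rate, small enough to stay stable
lambda = 0.01;          % assumed regularization strength

for s = 1:5000                          % assumed number of updates
    i = randi(n);                       % pick one training sample at random
    v = [1 x(i) x(i)^2];                % feature vector for that sample
    grad = ((w*v') - t(i)) * v + lambda * w;   % squared-error gradient plus penalty gradient
    w = w - a * grad;
end
disp(w);
```

Larger values of `lambda` pull the coefficients harder toward zero at the cost of a worse fit to the training points.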


2011-02-26 07:01:31
