XOR with neural networks (Matlab)

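So, I'm hoping this is a really dumb thing I'm doing and that there's an easy answer. I'm trying to train a 2x3x1 neural network to solve the XOR problem. It wasn't working, so I decided to dig in and see what was happening. Finally, I decided to assign the weights myself. This is the set of weights I came up with: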

theta1 = [11 0 -5; 0 12 -7;18 17 -20]; 
theta2 = [14 13 -28 -6];

(In Matlab notation.) I deliberately tried to make no two weights the same (apart from the zeros).
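As a quick sanity check that these weights really do encode XOR, a forward pass alone is enough. The sketch below reuses the input matrix X and targets T from the training code that follows; the names `hidden` and `output` are introduced here for illustration only, and the sigmoid is written as an anonymous function so the snippet runs as a plain script.

```matlab
% Forward pass only, using the hand-picked weights.
sigmoid = @(z) 1.0 ./ (1.0 + exp(-z));

X = [0 0 1 1; 0 1 0 1; 1 1 1 1];   % one input per column; last row is the bias
T = [0 1 1 0];                     % XOR targets

theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
theta2 = [14 13 -28 -6];

hidden = [sigmoid(theta1 * X); ones(1, size(X, 2))];  % append the bias unit
output = sigmoid(theta2 * hidden);

disp(output);          % raw network outputs
disp(round(output));   % thresholded at 0.5, for comparison with T
```

If the thresholded outputs equal T, the starting point is already a valid solution, so any drift away from it has to come from the update step rather than from the weights.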

And my code, which is really simple in Matlab, is

function layer2 = xornn(iters) 
    if nargin < 1 
     iters = 50 
    end 
    function s = sigmoid(X) 
     s = 1.0 ./ (1.0 + exp(-X)); 
    end 
    T = [0 1 1 0]; 
    X = [0 0 1 1; 0 1 0 1; 1 1 1 1]; 
    theta1 = [11 0 -5; 0 12 -7;18 17 -20]; 
    theta2 = [14 13 -28 -6]; 
    for i = [1:iters] 
     layer1 = [sigmoid(theta1 * X); 1 1 1 1]; 
     layer2 = sigmoid(theta2 * layer1) 
     delta2 = T - layer2; 
     delta1 = layer1 .* (1-layer1) .* (theta2' * delta2); 
     % remove the bias from delta 1. There's no real point in a delta on the bias. 
     delta1 = delta1(1:3,:); 
     theta2d = delta2 * layer1'; 
     theta1d = delta1 * X'; 
     theta1 = theta1 - 0.1 * theta1d; 
     theta2 = theta2 - 0.1 * theta2d; 
    end 
end

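I believe this is correct. I tested several of the parameters (the thetas) with the finite-difference method to see whether the gradients were right, and they seemed to be.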

But when I run it, it eventually boils down to returning all zeros. If I do xornn(1) (1 iteration) I get

0.0027 0.9966 0.9904 0.0008

But if I do xornn(35) I get

0.0026 0.9949 0.9572 0.0007

(it has started descending in the wrong direction), and by xornn(45) I get

0.0018 0.0975 0.0000 0.0003

If I run it for 10,000 iterations, it just returns all 0's.

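What is going on? Must I add regularization? I would have thought such a simple network wouldn't need it. But regardless, why does it move away from an obviously good solution that I have hand-fed it?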

Thanks!

Source

2015-05-11 bnsh

AAARRGGHHH! The solution was simply a matter of changing

theta1 = theta1 - 0.1 * theta1d; 
theta2 = theta2 - 0.1 * theta2d;

to

theta1 = theta1 + 0.1 * theta1d; 
theta2 = theta2 + 0.1 * theta2d;

sigh

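Now, though, I need to figure out how I ended up computing the negative of the gradient when what I thought I was computing was the... never mind. I'll post it here anyway, just in case it helps someone else.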

So, z is the sum of the inputs to the sigmoid, and y is the output of the sigmoid.

C = -(T * Log[y] + (1-T) * Log[1-y])

dC/dy = -((T/y) - (1-T)/(1-y)) 
     = -((T(1-y)-y(1-T))/(y(1-y))) 
     = -((T-Ty-y+Ty)/(y(1-y))) 
     = -((T-y)/(y(1-y))) 
     = ((y-T)/(y(1-y))) # This is the source of all my woes. 
dy/dz = y(1-y) 
dC/dz = ((y-T)/(y(1-y))) * y(1-y) 
     = (y-T)
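The result dC/dz = (y-T) can also be checked numerically. The sketch below is one way to do it, under stated assumptions: the cost is C summed over the four training examples, only the output-layer weights theta2 are perturbed, and the step size `h` and the helper names `predict`, `cost`, `analytic` and `numeric` are made up for this illustration.

```matlab
% Central-difference check of the theta2 gradient for the cross-entropy cost.
sigmoid = @(z) 1.0 ./ (1.0 + exp(-z));

T = [0 1 1 0];
X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
theta2 = [14 13 -28 -6];

layer1  = [sigmoid(theta1 * X); 1 1 1 1];
predict = @(th2) sigmoid(th2 * layer1);
cost    = @(th2) -sum(T .* log(predict(th2)) + (1 - T) .* log(1 - predict(th2)));

% Analytic gradient, using dC/dz = y - T from the derivation.
analytic = (predict(theta2) - T) * layer1';

% Numerical gradient, one weight at a time (h is an illustrative choice).
h = 1e-5;
numeric = zeros(size(theta2));
for k = 1:numel(theta2)
    step = zeros(size(theta2));
    step(k) = h;
    numeric(k) = (cost(theta2 + step) - cost(theta2 - step)) / (2 * h);
end

disp([analytic; numeric]);   % the two rows should agree closely

% The same product built from (T - y): equal magnitude, opposite sign.
flipped = (T - predict(theta2)) * layer1';
disp(flipped);
```

A check like this only catches a sign error if the signs of the two gradients are compared, not just their magnitudes.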

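So the problem is that I was accidentally computing T-y, because I forgot the negative sign in front of the cost function. Then I was subtracting what I thought was the gradient, but was in fact the negative gradient. And there. That was the problem.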

Once I did that:

function layer2 = xornn(iters) 
    if nargin < 1 
     iters = 50 
    end 
    function s = sigmoid(X) 
     s = 1.0 ./ (1.0 + exp(-X)); 
    end 
    T = [0 1 1 0]; 
    X = [0 0 1 1; 0 1 0 1; 1 1 1 1]; 
    theta1 = [11 0 -5; 0 12 -7;18 17 -20]; 
    theta2 = [14 13 -28 -6]; 
    for i = [1:iters] 
     layer1 = [sigmoid(theta1 * X); 1 1 1 1]; 
     layer2 = sigmoid(theta2 * layer1) 
     delta2 = T - layer2; 
     delta1 = layer1 .* (1-layer1) .* (theta2' * delta2); 
     % remove the bias from delta 1. There's no real point in a delta on the bias. 
     delta1 = delta1(1:3,:); 
     theta2d = delta2 * layer1'; 
     theta1d = delta1 * X'; 
     theta1 = theta1 + 0.1 * theta1d; 
     theta2 = theta2 + 0.1 * theta2d; 
    end 
end

xornn(50) returns 0.0028 0.9972 0.9948 0.0009 and xornn(10000) returns 0.0016 0.9989 0.9993 0.0005
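To see the two update rules side by side, one option is to track the cost C itself instead of the raw outputs. This is only a sketch: it assumes C is summed over the four examples, runs 30 updates (an arbitrary count), and the names `signs` and `costs` are introduced purely for illustration. The loop body otherwise mirrors xornn.

```matlab
% Compare the cost under "theta + 0.1 * d" and "theta - 0.1 * d".
sigmoid = @(z) 1.0 ./ (1.0 + exp(-z));
T = [0 1 1 0];
X = [0 0 1 1; 0 1 0 1; 1 1 1 1];

signs = [+1 -1];        % +1: corrected update, -1: original update
costs = zeros(2, 2);    % rows follow signs; columns: [initial cost, final cost]

for s = 1:2
    theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
    theta2 = [14 13 -28 -6];
    for i = 0:30
        layer1 = [sigmoid(theta1 * X); 1 1 1 1];
        layer2 = sigmoid(theta2 * layer1);
        c = -sum(T .* log(layer2) + (1 - T) .* log(1 - layer2));
        if i == 0
            costs(s, 1) = c;   % cost of the hand-picked weights
        end
        costs(s, 2) = c;       % cost after i updates
        delta2 = T - layer2;
        delta1 = layer1 .* (1 - layer1) .* (theta2' * delta2);
        delta1 = delta1(1:3, :);           % drop the bias row
        theta2d = delta2 * layer1';
        theta1d = delta1 * X';
        theta1 = theta1 + signs(s) * 0.1 * theta1d;
        theta2 = theta2 + signs(s) * 0.1 * theta2d;
    end
end

disp(costs);
```

With the corrected sign the final cost should end up below the initial one, and with the original sign above it, which is the same drift the raw outputs show.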

Phew! Maybe this will help someone else debug their version.

Source

2015-05-11 19:40:20 bnsh
