
Backpropagation algorithm - error derivative calculation

When computing the error derivative, I am using the following expression, but I do not know exactly why it works:

double errorDerivative = (-output * (1-output) *(desiredOutput - output)); 

When I remove the minus sign from the first output term, training fails and hits the maximum epoch limit. I assumed it should instead look like the example at http://homepages.gold.ac.uk/nikolaev/311imlti.htm, which does not use the minus operator:

double errorDerivative2 = (output * (1-output) *(desiredOutput - output)); 
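For reference, the two expressions are exact negatives of each other. A quick standalone check (the values below are made up for illustration, not taken from the network code) shows this:

```java
// Illustrative check: errorDerivative and errorDerivative2 differ only in sign.
public class SignCheck {
    public static void main(String[] args) {
        double output = 0.7;         // sigmoid activation, made-up value
        double desiredOutput = 1.0;  // training target, made-up value

        double errorDerivative  = (-output * (1 - output) * (desiredOutput - output));
        double errorDerivative2 = ( output * (1 - output) * (desiredOutput - output));

        // One is exactly the negative of the other
        System.out.println(errorDerivative == -errorDerivative2); // prints true
    }
}
```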

At the moment I am looking into using stochastic gradient descent, and I would like to get there by modifying my existing implementation of the standard backpropagation algorithm. Currently, it looks like this:

public void applyBackpropagation(double expectedOutput[]) { 

     // error check, normalize value ]0;1[ 
     /*for (int i = 0; i < expectedOutput.length; i++) { 
      double d = expectedOutput[i]; 
      if (d < 0 || d > 1) { 
       if (d < 0) 
        expectedOutput[i] = 0 + epsilon; 
       else 
        expectedOutput[i] = 1 - epsilon; 
      } 
     }*/ 

     int i = 0; 
     for (Neuron n : outputLayer) { 
      System.out.println("neuron"); 
      ArrayList<Connection> connections = n.getAllInConnections(); 
      for (Connection con : connections) { 
       double output = n.getOutput(); 
       System.out.println("final output is "+output); 
       double ai = con.leftNeuron.getOutput(); 
       System.out.println("ai output is "+ai); 
       double desiredOutput = expectedOutput[i]; 

       double errorDerivative = (-output * (1-output) *(desiredOutput - output)); 
       double errorDerivative2 = (output * (1-output) *(desiredOutput - output)); 
       System.out.println("errorDerivative is "+errorDerivative); 
       System.out.println("errorDerivative my one is "+(output * (1-output) *(desiredOutput - output))); 
       double deltaWeight = -learningRate * errorDerivative2; 
       double newWeight = con.getWeight() + deltaWeight; 
       con.setDeltaWeight(deltaWeight); 
       con.setWeight(newWeight + momentum * con.getPrevDeltaWeight()); 
      } 
      i++; 
     } 

     // update weights for the hidden layer 
     for (Neuron n : hiddenLayer) { 
      ArrayList<Connection> connections = n.getAllInConnections(); 
      for (Connection con : connections) { 
       double output = n.getOutput(); 
       double ai = con.leftNeuron.getOutput(); 
       double sumKoutputs = 0; 
       int j = 0; 
       for (Neuron out_neu : outputLayer) { 
        double wjk = out_neu.getConnection(n.id).getWeight(); 
        double desiredOutput = (double) expectedOutput[j]; 
        double ak = out_neu.getOutput(); 
        j++; 
        sumKoutputs = sumKoutputs 
          + (-(desiredOutput - ak) * ak * (1 - ak) * wjk); 
       } 

       double partialDerivative = output * (1 - output) * ai * sumKoutputs; 
       double deltaWeight = -learningRate * partialDerivative; 
       double newWeight = con.getWeight() + deltaWeight; 
       con.setDeltaWeight(deltaWeight); 
       con.setWeight(newWeight + momentum * con.getPrevDeltaWeight()); 
      } 
     } 
    } 

What question would you like answered? Are you asking why the formula is what it is? Or do you want people to review your code? – 2012-02-26 23:32:02


Review the code, and explain why errorDerivative2 does not work but errorDerivative does. – unleashed 2012-02-27 00:23:07

Answer


Sorry, I am not going to review your code - no time. You will have to come back with a more specific question, and then I can help you.

The reason errorDerivative2 works is probably that you are using a weight update rule such as
deltaW = learningRate*errorDerivative2*input

Normally, what you call "errorDerivative2" is called delta, and for a neuron with a sigmoid transfer function it is defined as
-output * (1-output) * (desiredOutput - output)

with the weight update rule
deltaW = -learningRate*delta*input

So basically it works for you without the minus sign on errorDerivative2 because you have left out a minus sign somewhere else as well.
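To make the sign bookkeeping concrete, here is a minimal self-contained sketch (variable names and values are made up for illustration, not taken from the poster's classes). It shows that the two conventions - a minus sign inside delta combined with a minus sign in the update rule, or no minus sign in either place - produce exactly the same weight change:

```java
// Minimal sketch of the two equivalent sign conventions for a sigmoid
// output neuron trained on the squared error E = 1/2 * (t - o)^2.
public class DeltaSigns {
    public static void main(String[] args) {
        double learningRate = 0.5;    // made-up hyperparameter
        double output = 0.8;          // sigmoid activation of the neuron
        double desiredOutput = 1.0;   // training target
        double input = 0.6;           // activation feeding this weight

        // Convention A: delta carries the minus sign (it is dE/dnet),
        // and the update rule negates it again.
        double delta = -output * (1 - output) * (desiredOutput - output);
        double deltaW_A = -learningRate * delta * input;

        // Convention B: the minus sign is dropped from BOTH delta and the
        // update rule, so the two omissions cancel.
        double delta2 = output * (1 - output) * (desiredOutput - output);
        double deltaW_B = learningRate * delta2 * input;

        // Both conventions yield the same weight change.
        System.out.println(deltaW_A == deltaW_B); // prints true
    }
}
```

Either convention is fine on its own; mixing them (a minus sign in only one of the two places) flips the direction of the update so that it climbs the error surface instead of descending it, which would explain the maximum-epoch failure described in the question.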