Vectorized formula for the output layer of a neural network

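I have a neural network and want to use the trained network to evaluate a set of test data. I am struggling to write the formulas for the hidden layer and the output layer. My goal is a vectorized formula, but I would also be happy to implement a loop version.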

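I believe I now have the correct formula for the hidden layer and only need one for the output layer, but I would appreciate it if someone could confirm that the hidden-layer formula is properly vectorized.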

% Variables 
% Xtest test training data 
% thetah - trained weights for inputs to hidden layer 
% thetao - trained weights for hidden layer to outputs 
% ytest - output 

htest = (1 ./ (1 + exp(-(thetah * Xtest'))))' ; % FORMULA FOR HIDDEN LAYER 
ytest = ones(mtest, num_outputs) ; % FORMULA FOR OUTPUT LAYER

Source

2015-12-27 Jean de Toit

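Which formula should be confirmed? The ytest expression in your code only initializes a new matrix and is definitely not correct. Could you post what you have so far? What are the dimensions of Xtest? Is it a single vector or a set of input vectors? – Anton

htest is the one to confirm; for now ytest is just placeholder code that gives the right dimensions. Xtest is 6x7, the rest are 6x6 –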

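Below you can find a vectorized and a loop implementation of forward propagation. Because of differences in notation and in how you store the data in your matrices, you may have to adapt your input data to the code below.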

You need to add bias units to the input layer and to the hidden layer.
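As a minimal sketch of what adding a bias unit means here, assuming each row of the data matrix is one example: a column of ones is prepended, so that the first column of the weight matrix acts as the bias term. The matrix X and its values are made up for illustration.

```matlab
% Hypothetical toy data: 3 examples (rows), 2 features (columns)
X = [0.5 1.0; 1.5 2.0; 2.5 3.0];

m = size(X, 1);          % number of examples
a1 = [ones(m, 1) X];     % [m x (n+1)], first column is the bias unit
disp(size(a1));
```

The same step is repeated on the hidden-layer activations before they are multiplied by the output-layer weights.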

To simplify implementation and debugging, I took some data from the open-source machine learning repository and trained the network on the wine classification task.

Xtest - input data [178x13]
y - output classes [178x1]
thetah - parameters of the hidden layer [15x14]
thetao - parameters of the output layer [3x16]
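The sizes listed above can be checked with a quick shape walk-through. This is only a sketch: the matrices are filled with random placeholder values, not the trained weights or the wine data, so only the dimensions are meaningful.

```matlab
% Placeholder matrices with the listed sizes (random values, NOT the trained network)
Xtest  = rand(178, 13);
thetah = rand(15, 14);
thetao = rand(3, 16);

a1 = [ones(size(Xtest, 1), 1) Xtest];      % 178x14 (bias column added)
a2 = 1 ./ (1 + exp(-(a1 * thetah')));      % 178x15 hidden activations
a2 = [ones(size(a2, 1), 1) a2];            % 178x16 (bias column added)
h  = 1 ./ (1 + exp(-(a2 * thetao')));      % 178x3, one column per class
disp(size(h));
```

The extra column in thetah (14 rather than 13) and in thetao (16 rather than 15) is what accommodates the bias units.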

The network classifies the input data with a success rate of 97.7%.

Here is the code:

function [] = nn_fp() 

    load('Xtest.mat'); %input data 178x13 
    load('y.mat'); %output data 178x1 
    load('thetah.mat'); %Parameters of the hidden layer 15x14 
    load('thetao.mat'); %Parameters of the output layer 3x16 

    predict_simple(Xtest, y, thetah, thetao); 

    predict_vectorized(Xtest, y, thetah, thetao); 
end 

function predict_simple(Xtest, y, thetah, thetao) 

    mtest = size(Xtest, 1); %number of input examples 
    n = size(Xtest, 2); %number of features 
    hl_size = size(thetah, 1); %size of the hidden layer (without the bias unit) 
    num_outputs = size(thetao, 1); %size of the output layer 

    %add a bias unit to the input layer 
    a1 = [ones(mtest, 1) Xtest]; %[mtest x (n+1)] 

    %compute activations of the hidden layer 
    z2 = zeros(mtest, hl_size); %[mtest x hl_size] 
    a2 = zeros(mtest, hl_size); %[mtest x hl_size] 

    for i=1:mtest 
     for j=1:hl_size 
      for k=1:n+1 
       z2(i, j) = z2(i, j) + a1(i, k)*thetah(j, k); 
      end 

      a2(i, j) = sigmoid_simple(z2(i, j)); 
     end 
    end 

    %add a bias unit to the hidden layer 
    a2 = [ones(mtest, 1) a2]; %[mtest x (hl_size+1)] 

    %compute activations of the output layer 
    z3 = zeros(mtest, num_outputs); %[mtest x num_outputs] 
    h = zeros(mtest, num_outputs); %[mtest x num_outputs] 

    for i=1:mtest 
     for j=1:num_outputs 
      for k=1:hl_size+1 
       z3(i, j) = z3(i, j) + a2(i, k)*thetao(j, k); 
      end 

      h(i, j) = sigmoid_simple(z3(i, j)); %the hypothesis 
     end 
    end 

    %calculate predictions for each input example based on the maximum term 
    %of the hypothesis h 
    p = zeros(size(y)); 

    for i=1:mtest 
     max_ind = 1; 
     max_value = h(i, 1); 
     for j=2:num_outputs 
      if (h(i, j) > max_value) 
       max_ind = j; 
       max_value = h(i, j); 
      end 
     end 

     p(i) = max_ind; 
    end 

    %calculate the success rate of the prediction 
    correct_count = 0; 
    for i=1:mtest 
     if (p(i) == y(i)) 
      correct_count = correct_count + 1; 
     end 
    end 

    rate = correct_count/mtest*100; 

    display(['simple version rate:', num2str(rate)]); 
end 

function predict_vectorized(Xtest, y, thetah, thetao) 

    mtest = size(Xtest, 1); %number of input examples 

    %add a bias unit to the input layer 
    a1 = [ones(mtest, 1) Xtest]; 

    %compute activations of the hidden layer 
    z2 = a1*thetah'; 
    a2 = sigmoid_universal(z2); 

    %add a bias unit to the hidden layer 
    a2 = [ones(mtest, 1) a2]; 

    %compute activations of the output layer 
    z3 = a2*thetao'; 
    h = sigmoid_universal(z3); %the hypothesis 

    %calculate predictions for each input example based on the maximum term 
    %of the hypothesis h 
    [~,p] = max(h, [], 2); 
    %calculate the success rate of the prediction 
    rate = mean(double((p == y))) * 100; 
    display(['vectorized version rate:', num2str(rate)]); 
end 

function [ s ] = sigmoid_simple(z) 
    s = 1/(1+exp(-z)); 
end 

function [ s ] = sigmoid_universal(z) 
    s = 1./(1+exp(-z)); 
end
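A quick way to convince yourself that the loop and the vectorized forms compute the same thing is to compare them on a small random problem. The sketch below uses made-up sizes (4 examples, 3 features, 2 hidden units) and checks a single layer only.

```matlab
% Hypothetical small problem, values are random
a1    = [ones(4, 1) rand(4, 3)];   % [4 x 4] inputs with bias column
theta = rand(2, 4);                % [2 x 4] weights, one row per unit

% loop version: accumulate the weighted sum term by term
z_loop = zeros(4, 2);
for i = 1:4
    for j = 1:2
        for k = 1:4
            z_loop(i, j) = z_loop(i, j) + a1(i, k) * theta(j, k);
        end
    end
end

% vectorized version: a single matrix product
z_vec = a1 * theta';

% largest absolute difference; should be at rounding-error level
max_diff = max(abs(z_loop(:) - z_vec(:)));
disp(max_diff);
```

The triple loop is exactly the definition of the matrix product a1*theta', which is why the two results differ only by floating-point rounding.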

Source

2015-12-30 00:29:10 Anton

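Assuming your Xtest has dimensions N by M, where N is the number of examples and M is the number of features, thetah is an M by H1 matrix, where H1 is the number of units in the first hidden layer, and thetao is an H1 by O matrix, where O is the number of output classes, you would do the following: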

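a1 = Xtest * thetah; 
z1 = 1 ./ (1 + exp(-a1)); % assuming sigmoid units; ./ is needed for element-wise division 

a2 = z1 * thetao; 
z2 = softmax(a2')'; % the toolbox softmax normalizes each column, so transpose to get one distribution per example (row)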

Learn more about softmax here.
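If a toolbox softmax function is not available, a row-wise softmax can be written by hand. The following is a minimal sketch, assuming each row of a2 holds the output-layer pre-activations of one example; the values of a2 are made up, and the row maximum is subtracted first only to avoid overflow in exp.

```matlab
% Hypothetical pre-activations: 2 examples (rows), 3 classes (columns)
a2 = [1 2 3; 1 1 1];

% subtract each row's maximum before exponentiating (numerical stability)
shifted = bsxfun(@minus, a2, max(a2, [], 2));
e = exp(shifted);

% normalize so that every row sums to 1
probs = bsxfun(@rdivide, e, sum(e, 2));

% predicted class = column index of the largest probability in each row
[~, p] = max(probs, [], 2);
disp(probs);
disp(p);
```

Because softmax is monotonic, the predicted class is the same as taking the maximum of a2 directly; the normalization matters only when the outputs are needed as probabilities.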

Source

2015-12-27 15:38:20 Amir
