
Some background: strange behavior in a C++ single-layer, multi-output perceptron

I have written a single-layer, multi-output perceptron class in C++. It uses the typical W·X + b discriminant function and allows a user-defined activation function. I had tested everything fairly thoroughly and it all seemed to be working as I expected. Then I noticed a small logical error in my code, and when I tried to fix it the network suddenly performed much worse than before. The error is as follows:

I evaluate the value at each output neuron with the following code:

output[i] = 
      activate_(std::inner_product(weights_[i].begin(), weights_[i].end(), 
             features.begin(), -1 * biases_[i])); 

Here I treat the bias input as a fixed -1, but when I apply the learning rule to each bias I treat that input as +1.

// Bias can be treated as a weight with a constant feature value of 1. 
biases_[i] = weight_update(1, error, learning_rate_, biases_[i]); 

So I tried to fix my mistake by changing the weight_update call to be consistent with the output evaluation:

biases_[i] = weight_update(-1, error, learning_rate_, biases_[i]); 

But doing so resulted in a 20% drop in accuracy! I have spent the last few days pulling my hair out trying to find some other logical error in my code that might explain this strange behavior, but I have come up empty-handed. Can anyone with more knowledge offer any insight? I have provided the entire class below for reference. Thank you in advance.

#ifndef SINGLE_LAYER_PERCEPTRON_H 
#define SINGLE_LAYER_PERCEPTRON_H 

#include <algorithm>  // for std::for_each 
#include <cassert> 
#include <functional> 
#include <numeric> 
#include <vector> 
#include "functional.h" 
#include "random.h" 

namespace qp { 
namespace rf { 

namespace { 

template <typename Feature> 
double weight_update(const Feature& feature, const double error, 
        const double learning_rate, const double current_weight) { 
    return current_weight + (learning_rate * error * feature); 
} 

template <typename T> 
using Matrix = std::vector<std::vector<T>>; 

} // namespace 

template <typename Feature, typename Label, typename ActivationFn> 
class SingleLayerPerceptron { 
public: 
    // For testing only. 
    SingleLayerPerceptron(const Matrix<double>& weights, 
         const std::vector<double>& biases, double learning_rate) 
     : weights_(weights), 
     biases_(biases), 
     n_inputs_(weights.front().size()), 
     n_outputs_(biases.size()), 
     learning_rate_(learning_rate) {} 

    // Initialize the layer with random weights and biases in [-1, 1]. 
    SingleLayerPerceptron(std::size_t n_inputs, std::size_t n_outputs, 
         double learning_rate) 
     : n_inputs_(n_inputs), 
     n_outputs_(n_outputs), 
     learning_rate_(learning_rate) { 
    weights_.resize(n_outputs_); 
    std::for_each(
     weights_.begin(), weights_.end(), [this](std::vector<double>& wv) { 
      generate_back_n(wv, n_inputs_, 
          std::bind(random_real_range<double>, -1, 1)); 
     }); 

    generate_back_n(biases_, n_outputs_, 
        std::bind(random_real_range<double>, -1, 1)); 
    } 

    std::vector<double> predict(const std::vector<Feature>& features) const { 
    std::vector<double> output(n_outputs_); 
    for (auto i = 0ul; i < n_outputs_; ++i) { 
     output[i] = 
      activate_(std::inner_product(weights_[i].begin(), weights_[i].end(), 
             features.begin(), -1 * biases_[i])); 
    } 
    return output; 
    } 

    void learn(const std::vector<Feature>& features, 
      const std::vector<double>& true_output) { 
    const auto actual_output = predict(features); 
    for (auto i = 0ul; i < n_outputs_; ++i) { 
     const auto error = true_output[i] - actual_output[i]; 
     for (auto weight = 0ul; weight < n_inputs_; ++weight) { 
     weights_[i][weight] = weight_update(
      features[weight], error, learning_rate_, weights_[i][weight]); 
     } 
     // Bias can be treated as a weight with a constant feature value of 1. 
     biases_[i] = weight_update(1, error, learning_rate_, biases_[i]); 
    } 
    } 

private: 
    Matrix<double> weights_;  // n_outputs x n_inputs 
    std::vector<double> biases_; // 1 x n_outputs 
    std::size_t n_inputs_; 
    std::size_t n_outputs_; 
    ActivationFn activate_; 
    double learning_rate_; 
}; 

struct StepActivation { 
    double operator()(const double x) const { return x > 0 ? 1 : -1; } 
}; 

} // namespace rf 
} // namespace qp 

#endif /* SINGLE_LAYER_PERCEPTRON_H */ 
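
For reference, here is a minimal sketch of how the class can be driven (the header file name, the toy AND data, and the epoch count are illustrative assumptions, not part of the posted code):

// Minimal driver sketch: a 2-input, 1-output perceptron with StepActivation 
// learning AND on +/-1 inputs. "single_layer_perceptron.h" is a hypothetical 
// file name for the class above. 
#include <cstddef> 
#include <iostream> 
#include <vector> 
#include "single_layer_perceptron.h" 

int main() { 
    using qp::rf::SingleLayerPerceptron; 
    using qp::rf::StepActivation; 

    // Deterministic start via the testing constructor: zero weights and bias, 
    // learning rate 0.1. 
    SingleLayerPerceptron<double, int, StepActivation> layer({{0.0, 0.0}}, {0.0}, 0.1); 

    // AND on +/-1 inputs, with +/-1 targets to match StepActivation's range. 
    const std::vector<std::vector<double>> inputs = {{-1, -1}, {-1, 1}, {1, -1}, {1, 1}}; 
    const std::vector<std::vector<double>> targets = {{-1}, {-1}, {-1}, {1}}; 

    for (int epoch = 0; epoch < 10; ++epoch) { 
     for (std::size_t i = 0; i < inputs.size(); ++i) { 
      layer.learn(inputs[i], targets[i]); 
     } 
    } 

    for (const auto& in : inputs) { 
     std::cout << in[0] << ' ' << in[1] << " -> " << layer.predict(in)[0] << '\n'; 
    } 
} 

With this linearly separable toy set a handful of epochs is enough for the weights to settle, so it is a quick way to sanity-check both the original and the fixed bias update.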

Answer


I finally figured it out...

My fix was in fact correct, and the drop in accuracy was simply the result of having a lucky (or unlucky) dataset.
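
To see why the fix is the right convention: a perceptron that feeds its bias a constant -1 input and stores bias b computes exactly the same net input as the textbook perceptron that stores bias -b with a constant +1 input, and the two stay equivalent after each update as long as each version uses its own bias input consistently. A small standalone sketch of that equivalence (the weights, feature values, and learning rate are arbitrary illustrative numbers):

// Standalone check: convention A (-1 bias input, as in the posted class) and 
// convention B (+1 bias input, textbook form) give identical net inputs both 
// before and after one bias update, provided each is applied consistently. 
#include <iostream> 
#include <numeric> 
#include <vector> 

int main() { 
    const std::vector<double> w = {0.5, -0.3}; 
    const std::vector<double> x = {1.0, 2.0}; 
    const double lr = 0.1; 
    const double error = 1.0; 

    double b_minus = 0.4;      // convention A: net = w.x + (-1) * b_minus 
    double b_plus = -b_minus;  // convention B: net = w.x + (+1) * b_plus 

    const auto net_minus = [&] { 
     return std::inner_product(w.begin(), w.end(), x.begin(), -1 * b_minus); 
    }; 
    const auto net_plus = [&] { 
     return std::inner_product(w.begin(), w.end(), x.begin(), b_plus); 
    }; 

    std::cout << net_minus() << " vs " << net_plus() << '\n';  // identical 

    // Each convention updates its bias with its own constant input: -1 vs +1. 
    b_minus += lr * error * -1; 
    b_plus += lr * error * 1; 

    std::cout << net_minus() << " vs " << net_plus() << '\n';  // still identical 
} 

The original code mixed the two conventions (predicting with -1 but updating with +1), which makes each bias update move the net input in the wrong direction, so the fix itself is sound; the accuracy swing really does come down to the particular dataset.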