Convolutional neural network not converging

I've been watching some videos on deep learning/convolutional neural networks, like here and here, and I'm trying to implement my own in C++. I'm trying to keep the input data fairly simple for my first attempt, so the idea is to differentiate between a cross and a circle. I have a small dataset of around 25 of each (64*64 images); they look like this:
The network itself is five layers:
Convolution (5 filters, size 3, stride 1, with a ReLU)
MaxPool (size 2)
Convolution (1 filter, size 3, stride 1, with a ReLU)
MaxPool (size 2)
Linear Regression classifier
My problem is that my network isn't converging on anything. The weights don't appear to change. If I run it, the predictions mostly stay the same, apart from the occasional outlier that jumps up before returning on the next iteration.
The training for the convolutional layers looks something like this, with some loops stripped out to keep it clean:
// Yeah, I know I should change the shared_ptr<float>
void ConvolutionalNetwork::Train(std::shared_ptr<float> input, std::shared_ptr<float> outputGradients, float label)
{
    float biasGradient = 0.0f;

    // Calculate the deltas with respect to the input.
    for (int layer = 0; layer < m_Filters.size(); ++layer)
    {
        // Pseudo-code, each loop on its own line in the actual code
        For z < depth, x < width - filterSize, y < height - filterSize
        {
            int newImageIndex = layer*m_OutputWidth*m_OutputHeight + y*m_OutputWidth + x;

            For the bounds of the filter (U,V)
            {
                // Find the index in the input image
                int imageIndex = x + (y+v)*m_OutputWidth + z*m_OutputHeight*m_OutputWidth;
                int kernelIndex = u + v*m_FilterSize + z*m_FilterSize*m_FilterSize;

                m_pGradients.get()[imageIndex] += outputGradients.get()[newImageIndex]*input.get()[imageIndex];
                m_GradientSum[layer].get()[kernelIndex] += m_pGradients.get()[imageIndex] * m_Filters[layer].get()[kernelIndex];
                biasGradient += m_GradientSum[layer].get()[kernelIndex];
            }
        }
    }

    // Update the weights
    for (int layer = 0; layer < m_Filters.size(); ++layer)
    {
        For z < depth, U & V < filterSize
        {
            int kernelIndex = u + v*m_FilterSize + z*m_FilterSize*m_FilterSize;
            m_Filters[layer].get()[kernelIndex] -= learningRate*m_GradientSum[layer].get()[kernelIndex];
        }
        m_pBiases.get()[layer] -= learningRate*biasGradient;
    }
}
So, I create a buffer (m_pGradients), which is the size of the input buffer, to feed the gradients back to the previous layer, but I use the gradient sum to adjust the weights.
The max pooling propagates the gradients back like this (it saves the max indices and zeroes out all the other gradients):
void MaxPooling::Train(std::shared_ptr<float> input, std::shared_ptr<float> outputGradients, float label)
{
    for (int outputVolumeIndex = 0; outputVolumeIndex < m_OutputVolumeSize; ++outputVolumeIndex)
    {
        int inputIndex = m_Indices.get()[outputVolumeIndex];
        m_pGradients.get()[inputIndex] = outputGradients.get()[outputVolumeIndex];
    }
}
And the final regression layer calculates its gradients like this:
void LinearClassifier::Train(std::shared_ptr<float> data, std::shared_ptr<float> output, float y)
{
    float* x = data.get();
    float biasError = 0.0f;
    float h = Hypothesis(output) - y;

    for (int i = 1; i < m_NumberOfWeights; ++i)
    {
        float error = h*x[i];
        m_pGradients.get()[i] = error;
        biasError += error;
    }

    float cost = h;
    m_Error = cost*cost;

    for (int theta = 1; theta < m_NumberOfWeights; ++theta)
    {
        m_pWeights.get()[theta] = m_pWeights.get()[theta] - learningRate*m_pGradients.get()[theta];
    }
    m_pWeights.get()[0] -= learningRate*biasError;
}
After 100 iterations of training on those two examples, the prediction for each is the same as the other's and unchanged from the start.
- Should a convolutional network like this be able to tell the difference between the two classes?
- Is this the correct approach?
- Should I be accounting for the ReLU (max) in the backpropagation of the convolutional layers?
Thanks! I'll try those and get back to you. The images aren't centered, the colours differ, etc., so I assumed a linear classifier would fail across the whole test set. But I'll try both. – Davors72