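Theoretical question about the layers of a DNN with batch normalization in Keras

I am having some trouble understanding, in detail, a DNN model that uses batch normalization in Keras. Can someone explain the structure and content of each layer in the model I built?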

import time
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation, Dropout

# num_classes, X_2, y_2, X_test and y_test are defined elsewhere (not shown here)
modelbatch = Sequential()
# First hidden block: Dense -> BatchNormalization -> ReLU -> Dropout
modelbatch.add(Dense(512, input_dim=1120))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('relu'))
modelbatch.add(Dropout(0.5))

# Second hidden block
modelbatch.add(Dense(256))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('relu'))
modelbatch.add(Dropout(0.5))

# Output block
modelbatch.add(Dense(num_classes))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('softmax'))
# Compile model
modelbatch.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
start = time.time()
model_info = modelbatch.fit(X_2, y_2, batch_size=500,
                            epochs=20, verbose=2, validation_data=(X_test, y_test))
end = time.time()

With this, I think, I get all the layers of my model:

print(modelbatch.layers[0].get_weights()[0].shape) 
(1120, 512) 
print(modelbatch.layers[0].get_weights()[1].shape) 
(512,) 
print(modelbatch.layers[1].get_weights()[0].shape) 
(512,) 
print(modelbatch.layers[1].get_weights()[1].shape) 
(512,) 
print(modelbatch.layers[1].get_weights()[2].shape) 
(512,) 
print(modelbatch.layers[1].get_weights()[3].shape) 
(512,) 
print(modelbatch.layers[4].get_weights()[0].shape) 
(512, 256) 
print(modelbatch.layers[4].get_weights()[1].shape) 
(256,) 
print(modelbatch.layers[5].get_weights()[0].shape) 
(256,) 
print(modelbatch.layers[5].get_weights()[1].shape) 
(256,) 
print(modelbatch.layers[5].get_weights()[2].shape) 
(256,) 
print(modelbatch.layers[5].get_weights()[3].shape) 
(256,) 
print(modelbatch.layers[8].get_weights()[0].shape) 
(256, 38) 
print(modelbatch.layers[8].get_weights()[1].shape) 
(38,) 
print(modelbatch.layers[9].get_weights()[0].shape) 
(38,) 
print(modelbatch.layers[9].get_weights()[1].shape) 
(38,) 
print(modelbatch.layers[9].get_weights()[2].shape) 
(38,) 
print(modelbatch.layers[9].get_weights()[3].shape) 
(38,)

I would appreciate your help. Thanks in advance.

Source

2017-08-17 Jorge Mariano Mamani Soria

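Let's go through the model:

You have an input layer of dimension 1120, connected to a first hidden layer with 512 neurons, followed by your batch normalization layer. After that comes your activation function, and after that your dropout layer. Note that you can use the command model.summary() to visualize your model.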
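As a rough sketch of what model.summary() would list for this model, the parameter count of each layer can be worked out by hand. The snippet below is plain Python, not Keras. It assumes num_classes = 38 (taken from the (256, 38) shape printed in the question), and the helper names dense_params and batchnorm_params are made up for illustration.

```python
# Hand-computed parameter counts for the model in the question.
# Assumption: num_classes = 38, as suggested by the (256, 38) shape printed there.

def dense_params(n_in, n_out):
    """Kernel of shape (n_in, n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """gamma, beta, moving mean, moving variance: four vectors of length n_features."""
    return 4 * n_features

num_classes = 38
layers = [
    ("Dense(512)", dense_params(1120, 512)),
    ("BatchNormalization", batchnorm_params(512)),
    ("Activation('relu')", 0),
    ("Dropout(0.5)", 0),
    ("Dense(256)", dense_params(512, 256)),
    ("BatchNormalization", batchnorm_params(256)),
    ("Activation('relu')", 0),
    ("Dropout(0.5)", 0),
    ("Dense(num_classes)", dense_params(256, num_classes)),
    ("BatchNormalization", batchnorm_params(num_classes)),
    ("Activation('softmax')", 0),
]

# The list position is the index used in modelbatch.layers[index].
for index, (name, count) in enumerate(layers):
    print(index, name, count)

total = sum(count for _, count in layers)
print("total parameters:", total)
```

Only the layers at indices 0, 1, 4, 5, 8 and 9 hold weights, which is why those are the indices used with get_weights() in the question. Half of each batch normalization layer's parameters (the moving mean and moving variance) are not updated by gradient descent.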

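In theory, you can (and should) think of these layers as a single layer to which the following transformations are applied: batch normalization, activation, and dropout. In practice, each of them is implemented as a separate layer in Keras because of what you gain from modularity: instead of coding every possible way of designing a layer, the user can choose whether to add batch norm or dropout to a layer. To see a modular implementation, I recommend looking at http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture4.pdf, and more generally at http://cs231n.stanford.edu/syllabus.html if you want deeper knowledge.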

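For the batch normalization layer, you can notice 4 parameters: two trainable parameters, gamma and beta, and two parameters set by the data (the mean and standard deviation; Keras stores them as a moving mean and a moving variance). To understand what they are, look at the Stanford course; you can also find the details in the original paper on batch normalization, https://arxiv.org/abs/1502.03167. It is just a trick to speed up learning and improve accuracy by normalizing the data at each layer, just as you would in a preprocessing step on the input data.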
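To make the role of the four parameters concrete, here is a minimal NumPy sketch of what a batch normalization layer computes at prediction time, following the formula in the paper linked above. The function name batchnorm_inference and the sample values are made up for illustration, and epsilon=1e-3 is assumed to match the Keras default. During training the layer normalizes with the statistics of the current batch instead of the stored ones.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, moving_mean, moving_variance, epsilon=1e-3):
    """Normalize each feature with the stored statistics, then scale and shift."""
    x_hat = (x - moving_mean) / np.sqrt(moving_variance + epsilon)
    return gamma * x_hat + beta

# Made-up values for a batch of 2 samples with 3 features (illustration only).
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
gamma = np.ones(3)                              # learned scale
beta = np.zeros(3)                              # learned shift
moving_mean = np.array([2.5, 3.5, 4.5])         # estimated from the training data
moving_variance = np.array([2.25, 2.25, 2.25])  # estimated from the training data

y = batchnorm_inference(x, gamma, beta, moving_mean, moving_variance)
print(y)
```

With the default layer settings, the arguments follow the order of the list returned by get_weights(): gamma, beta, moving mean, moving variance.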

From what I've said, you can infer the rest of your model.

N.B.: I wouldn't use a batchnormalization layer in the last step before the softmax.

Is it clearer?

Source

2017-08-17 09:40:11 Nathan

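Yes, thanks, it is clearer. My remaining doubt is only about the 4 parameters of batch normalization: I don't know whether I can use them to evaluate on other data. (As you know, you can save the model in Keras, or, for a simple DNN, you can get the weights and biases through model.layers.get_weights() and evaluate other data with them.) I would like to do the same thing in this case with batch normalization, but I don't know which of all the layers I need in order to evaluate in another environment. Thanks in advance!

You mean you want to use your trained model to make predictions without the Keras API, and you want to copy all the weights and the architecture into another project? – Nathan

Yes, just like with a simple DNN example, where I get these weights: 'weights1 = modelbatch.layers[0].get_weights()[0]' # the 1st hidden layer 'biases1 = ...' 'weights2 = modelbatch.layers[1].get_weights()[0]' # the 2nd hidden layer 'biases2 = .....' 'weights3 = modelbatch.layers[4].get_weights()[0]' # the output layer 'biases3 = ....'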
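For predicting outside Keras, one possible approach is to replay the layers in NumPy, as sketched below. This is an illustration under stated assumptions, not a verified drop-in replacement: the helper names are made up, the weights here are random stand-ins with the shapes printed in the question, and epsilon=1e-3 is assumed to match the Keras default. The Dropout layers are skipped because dropout does nothing at prediction time.

```python
import numpy as np

def dense(x, kernel, bias):
    return x @ kernel + bias

def batchnorm(x, gamma, beta, moving_mean, moving_variance, epsilon=1e-3):
    return gamma * (x - moving_mean) / np.sqrt(moving_variance + epsilon) + beta

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    shifted = x - x.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def predict(x, weights):
    """weights maps a layer index to the list returned by that layer's get_weights()."""
    h = relu(batchnorm(dense(x, *weights[0]), *weights[1]))
    h = relu(batchnorm(dense(h, *weights[4]), *weights[5]))
    return softmax(batchnorm(dense(h, *weights[8]), *weights[9]))

# Random stand-in weights (hypothetical values, only the shapes match the question).
rng = np.random.default_rng(0)

def fake_dense(n_in, n_out):
    return [rng.normal(scale=0.05, size=(n_in, n_out)), np.zeros(n_out)]

def fake_batchnorm(n):
    # Order: gamma, beta, moving mean, moving variance
    return [np.ones(n), np.zeros(n), np.zeros(n), np.ones(n)]

weights = {
    0: fake_dense(1120, 512), 1: fake_batchnorm(512),
    4: fake_dense(512, 256),  5: fake_batchnorm(256),
    8: fake_dense(256, 38),   9: fake_batchnorm(38),
}

x = rng.normal(size=(5, 1120))  # 5 made-up input rows
probs = predict(x, weights)
print(probs.shape)
```

With the trained model, the dictionary could instead be filled with weights = {i: modelbatch.layers[i].get_weights() for i in (0, 1, 4, 5, 8, 9)}. So every Dense layer and every BatchNormalization layer is needed, with all four arrays of the latter.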
