驗證的準確性總是大於Keras中的訓練準確性

我想用mnist數據集訓練一個簡單的神經網絡。出於某種原因，當我獲得歷史記錄（從model.fit返回的參數）時，驗證準確性高於訓練準確性，這非常奇怪，但是如果在評估模型時檢查得分，則得到更高訓練的準確性比測試準確。驗證的準確性總是大於Keras中的訓練準確性

出現這種情況，每次，不管模型的參數。另外，如果我使用自定義回調並訪問參數'acc'和'val_acc'，則會發現相同的問題（數字與歷史記錄中返回的數字相同）。

請幫幫我！我究竟做錯了什麼？爲什麼驗證的準確度高於培訓準確度（您可以看到我在查看損失時有同樣的問題）。

這是我的代碼：

#!/usr/bin/env python3.5 

from keras.layers import Dense, Dropout, Activation, Flatten 
from keras.layers import Conv2D, MaxPooling2D 
import numpy as np 
from keras import backend 
from keras.utils import np_utils 
from keras import losses 
from keras import optimizers 
from keras.datasets import mnist 
from keras.models import Sequential 
from matplotlib import pyplot as plt 

# get train and test data (minst) and reduce volume to speed up (for testing) 
(x_train, y_train), (x_test, y_test) = mnist.load_data() 
data_reduction = 20 
x_train = x_train[:x_train.shape[0] // data_reduction] 
y_train = y_train[:y_train.shape[0] // data_reduction] 
x_test = x_test[:x_test.shape[0] // data_reduction] 
y_test = y_test[:y_test.shape[0] // data_reduction] 
try: 
    IMG_DEPTH = x_train.shape[3] 
except IndexError: 
    IMG_DEPTH = 1 # B/W 
labels = np.unique(y_train) 
N_LABELS = len(labels) 
# reshape input data 
if backend.image_data_format() == 'channels_first': 
    X_train = x_train.reshape(x_train.shape[0], IMG_DEPTH, x_train.shape[1], x_train.shape[2]) 
    X_test = x_test.reshape(x_test.shape[0], IMG_DEPTH, x_train.shape[1], x_train.shape[2]) 
    input_shape = (IMG_DEPTH, x_train.shape[1], x_train.shape[2]) 
else: 
    X_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], IMG_DEPTH) 
    X_test = x_test.reshape(x_test.shape[0], x_train.shape[1], x_train.shape[2], IMG_DEPTH) 
    input_shape = (x_train.shape[1], x_train.shape[2], IMG_DEPTH) 
# convert data type to float32 and normalize data values to range [0, 1] 
X_train = X_train.astype('float32') 
X_test = X_test.astype('float32') 
X_train /= 255 
X_test /= 255 
# reshape input labels 
Y_train = np_utils.to_categorical(y_train, N_LABELS) 
Y_test = np_utils.to_categorical(y_test, N_LABELS) 

# create model 
opt = optimizers.Adam() 
loss = losses.categorical_crossentropy 
model = Sequential() 
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape)) 
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 
model.add(Dropout(0.25)) 
model.add(Flatten()) 
model.add(Dense(32, activation='relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(len(labels), activation='softmax')) 
model.compile(optimizer=optimizers.Adam(), loss=losses.categorical_crossentropy, metrics=['accuracy']) 
# fit model 
history = model.fit(X_train, Y_train, batch_size=64, epochs=50, verbose=True, 
        validation_data=(X_test, Y_test)) 
# evaluate model 
train_score = model.evaluate(X_train, Y_train, verbose=True) 
test_score = model.evaluate(X_test, Y_test, verbose=True) 

print("Validation:", test_score[1]) 
print("Training: ", train_score[1]) 
print("--------------------") 
print("First 5 samples validation:", history.history["val_acc"][0:5]) 
print("First 5 samples training:", history.history["acc"][0:5]) 
print("--------------------") 
print("Last 5 samples validation:", history.history["val_acc"][-5:]) 
print("Last 5 samples training:", history.history["acc"][-5:]) 

# plot history 
plt.ion() 
fig = plt.figure() 
subfig = fig.add_subplot(122) 
subfig.plot(history.history['acc'], label="training") 
if history.history['val_acc'] is not None: 
    subfig.plot(history.history['val_acc'], label="validation") 
subfig.set_title('Model Accuracy') 
subfig.set_xlabel('Epoch') 
subfig.legend(loc='upper left') 
subfig = fig.add_subplot(121) 
subfig.plot(history.history['loss'], label="training") 
if history.history['val_loss'] is not None: 
    subfig.plot(history.history['val_loss'], label="validation") 
subfig.set_title('Model Loss') 
subfig.set_xlabel('Epoch') 
subfig.legend(loc='upper left') 
plt.ioff() 

input("Press ENTER to close the plots...")

輸出我得到的是以下幾點：

Validation accuracy: 0.97599999999999998 
Training accuracy: 1.0 
-------------------- 
First 5 samples validation: [0.83400000286102294, 0.89200000095367427, 0.91599999904632567, 0.9279999976158142, 0.9399999990463257] 
First 5 samples training: [0.47133333333333333, 0.70566666682561241, 0.76933333285649619, 0.81133333333333335, 0.82366666714350378] 
-------------------- 
Last 5 samples validation: [0.9820000019073486, 0.9860000019073486, 0.97800000190734859, 0.98399999713897701, 0.975999997138977] 
Last 5 samples training: [0.9540000001589457, 0.95766666698455816, 0.95600000031789145, 0.95100000031789145, 0.95033333381017049]

在這裏你可以看到我得到的情節： Training and Validation accuracy and loss plots

我不知道如果這是相關的，但我使用python 3.5和keras 2.0.4。

來源

2017-07-17 danidc

過度擬合應該會使訓練錯誤增加，驗證錯誤降低，反之亦然。 – danidc

從Keras FAQ：

爲什麼培訓的損失比測試損失要高得多？

Keras模型有兩種模式：訓練和測試。正常化機制，例如Dropout和L1/L2重量正則化，在測試時關閉。

此外，培訓損失是每批培訓數據的平均損失。因爲你的模型隨着時間的推移而變化，所以一個時代的第一批的損失通常比最後一批的損失要高。另一方面，使用該模型計算時期的測試損失，因爲它在時期結束時計算，導致較低的損失。

所以你看到的行爲並不像看到ML理論後看起來那麼不尋常。這也解釋了當你在同一個模型上同時評估訓練和測試集時，你突然會得到預期的行爲（train acc> val acc）。我猜想在你的情況下，dropout的存在尤其會妨礙訓練期間的準確性達到1.0，而在評估（測試）期間達到這個水平。

您可以通過添加回調來進一步調查，該回調可以在每個時期保存模型。然後你可以使用這兩套評估每個保存的模型來重新創建你的圖。

來源

2017-07-24 09:43:57 5Ke

你是對的，原因必須是輟學正規化者。在計算訓練精度（或損失）時，我們只使用網絡中遺漏的部分，因此訓練精度看起來小於驗證精度，因爲在第一個練習中我們不使用整個網絡，而在我們做的第二個。謝謝，我已經堅持了幾個星期！ – danidc

驗證的準確性總是大於Keras中的訓練準確性

回答

相關問題