作爲一個實驗,我構建了一個keras模型來逼近矩陣的行列式。但是,當我運行它時,每個時代的損失都會下降,驗證損失會上升!例如:如何用keras近似行列式
8s - loss: 7573.9168 - val_loss: 21831.5428
Epoch 21/50
8s - loss: 7345.0197 - val_loss: 23594.8540
Epoch 22/50
13s - loss: 7087.7454 - val_loss: 24718.3967
Epoch 23/50
7s - loss: 6851.8714 - val_loss: 25624.8609
Epoch 24/50
6s - loss: 6637.8168 - val_loss: 26616.7835
Epoch 25/50
7s - loss: 6446.8898 - val_loss: 28856.9654
Epoch 26/50
7s - loss: 6255.7414 - val_loss: 30122.7924
Epoch 27/50
7s - loss: 6054.5280 - val_loss: 32458.5306
Epoch 28/50
下面是完整的代碼:
import numpy as np
import sys
from scipy.stats import pearsonr
from scipy.linalg import det
from sklearn.model_selection import train_test_split
from tqdm import tqdm
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import math
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from keras import backend as K
def baseline_model():
# create model
model = Sequential()
model.add(Dense(200, input_dim=n**2, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, input_dim=n**2))
# model.add(Dense(1, kernel_initializer='normal'))
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')
return model
n = 15
print("Making the input data using seed 7", file=sys.stderr)
np.random.seed(7)
U = np.random.choice([0, 1], size=(n**2,n))
#U is a random orthogonal matrix
X =[]
Y =[]
# print(U)
for i in tqdm(range(100000)):
I = np.random.choice(n**2, size = n)
# Pick out the random rows and sort the rows of the matrix lexicographically.
A = U[I][np.lexsort(np.rot90(U[I]))]
X.append(A.ravel())
Y.append(det(A))
X = np.array(X)
Y = np.array(Y)
print("Data created")
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=32, verbose=2)))
pipeline = Pipeline(estimators)
X_train, X_test, y_train, y_test = train_test_split(X, Y,
train_size=0.75, test_size=0.25)
pipeline.fit(X_train, y_train, mlp__validation_split=0.3)
我怎麼能阻止它過度擬合得厲害?
更新1
我嘗試添加更多層和L_2正規化。但是,它幾乎沒有區別。
def baseline_model():
# create model
model = Sequential()
model.add(Dense(n**2, input_dim=n**2, kernel_initializer='glorot_normal', activation='relu'))
model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(1, kernel_initializer='glorot_normal'))
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')
return model
我增加了曆元數爲100,它與完成:
19s - loss: 788.9504 - val_loss: 18423.2807
Epoch 97/100
24s - loss: 760.2046 - val_loss: 18305.9273
Epoch 98/100
20s - loss: 806.0941 - val_loss: 18174.8706
Epoch 99/100
24s - loss: 780.0487 - val_loss: 18356.7482
Epoch 100/100
27s - loss: 749.2595 - val_loss: 18331.5859
是否可以近似用keras一個矩陣的行列式?
這不是過度擬合,您的模型不適合數據。該模型太簡單了。 –
@MatiasValdenegro我稱之爲過度擬合的原因是虧損持續下降到0,並且validation_loss一直在繼續。增加隱藏層中的節點數量根本無濟於事。接下來你會嘗試什麼? – eleanora
增加隱藏層的數量。使用'glorot'初始化隱藏層。使用'dropout'或'l2 regularizer' – Nain