2017-10-13

How to approximate a determinant with keras

As an experiment, I built a keras model to approximate the determinant of a matrix. However, when I run it, the loss goes down with every epoch while the validation loss goes up! For example:

8s - loss: 7573.9168 - val_loss: 21831.5428 
Epoch 21/50 
8s - loss: 7345.0197 - val_loss: 23594.8540 
Epoch 22/50 
13s - loss: 7087.7454 - val_loss: 24718.3967 
Epoch 23/50 
7s - loss: 6851.8714 - val_loss: 25624.8609 
Epoch 24/50 
6s - loss: 6637.8168 - val_loss: 26616.7835 
Epoch 25/50 
7s - loss: 6446.8898 - val_loss: 28856.9654 
Epoch 26/50 
7s - loss: 6255.7414 - val_loss: 30122.7924 
Epoch 27/50 
7s - loss: 6054.5280 - val_loss: 32458.5306 
Epoch 28/50 

Here is the full code:

import numpy as np 
import sys 
from scipy.stats import pearsonr 
from scipy.linalg import det 
from sklearn.model_selection import train_test_split 
from tqdm import tqdm 
from sklearn.preprocessing import StandardScaler 
from sklearn.pipeline import Pipeline 
import math 
import tensorflow as tf 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.wrappers.scikit_learn import KerasRegressor 
from keras import backend as K 

def baseline_model(): 
    # create model
    model = Sequential()
    model.add(Dense(200, input_dim=n**2, kernel_initializer='normal', activation='relu'))
    # input_dim is only needed on the first layer; later layers infer their input shape
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model


n = 15 

print("Making the input data using seed 7", file=sys.stderr) 
np.random.seed(7) 
U = np.random.choice([0, 1], size=(n**2,n)) 
# U is a random 0/1 matrix of shape (n**2, n); n of its rows are sampled below
X =[] 
Y =[] 
# print(U) 
for i in tqdm(range(100000)): 
     I = np.random.choice(n**2, size = n) 
     # Pick out the random rows and sort the rows of the matrix lexicographically. 
     A = U[I][np.lexsort(np.rot90(U[I]))] 
     X.append(A.ravel()) 
     Y.append(det(A)) 

X = np.array(X) 
Y = np.array(Y) 

print("Data created") 

estimators = [] 
estimators.append(('standardize', StandardScaler())) 
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=32, verbose=2))) 
pipeline = Pipeline(estimators) 
X_train, X_test, y_train, y_test = train_test_split(X, Y, 
                train_size=0.75, test_size=0.25) 
pipeline.fit(X_train, y_train, mlp__validation_split=0.3) 

How can I stop it from overfitting so badly?


Update 1

I tried adding more layers and L2 regularization, but it made almost no difference.

from keras import regularizers  # needed for the l2 penalties below

def baseline_model(): 
    # create model
    model = Sequential()
    model.add(Dense(n**2, input_dim=n**2, kernel_initializer='glorot_normal', activation='relu'))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(1, kernel_initializer='glorot_normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

I increased the number of epochs to 100, and it finished with:

19s - loss: 788.9504 - val_loss: 18423.2807 
Epoch 97/100 
24s - loss: 760.2046 - val_loss: 18305.9273 
Epoch 98/100 
20s - loss: 806.0941 - val_loss: 18174.8706 
Epoch 99/100 
24s - loss: 780.0487 - val_loss: 18356.7482 
Epoch 100/100 
27s - loss: 749.2595 - val_loss: 18331.5859 

Is it possible to approximate the determinant of a matrix with keras at all?


This isn't overfitting: your model doesn't fit the data. The model is too simple. –


@MatiasValdenegro The reason I called it overfitting is that the loss keeps dropping toward 0 while the validation loss keeps climbing. Increasing the number of nodes in the hidden layer doesn't help at all. What would you try next? – eleanora


Increase the number of hidden layers. Use 'glorot' initialization for the hidden layers. Use 'dropout' or an 'l2 regularizer'. – Nain
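A minimal sketch of the suggestions above (glorot initialization, Dropout, and an L2 penalty) on a model shaped like the one in the question. The layer widths and the 0.5/0.01 rates are illustrative guesses, not tuned values:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import regularizers

n = 15  # same matrix size as in the question

def regularized_model():
    model = Sequential()
    model.add(Dense(n**2, input_dim=n**2,
                    kernel_initializer='glorot_normal', activation='relu'))
    model.add(Dropout(0.5))  # randomly zero half the activations each step
    model.add(Dense(n**2 // 2, kernel_initializer='glorot_normal',
                    activation='relu',
                    kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dropout(0.5))
    model.add(Dense(1, kernel_initializer='glorot_normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
```

Dropout is usually the more effective of the two against a widening train/validation gap, since it directly prevents co-adaptation of units; the L2 penalty mostly shrinks the weights.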

Answer


I tested your code and got the same result. But let's go back to a basic understanding of the matrix determinant (DET). The DET is a sum of n! products, so you cannot approximate it with n*n weights in a few neural-network layers. This does not scale: for n = 15, the number of products in the DET is 15! = 1307674368000.


This isn't clear to me. The DET can of course be computed in n^3 time (not n!). Also, if you just run the keras model for several hundred epochs, the loss on the training set drops close to 0. – eleanora
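The n^3 claim refers to Gaussian elimination (LU factorization): reduce the matrix to triangular form and multiply the pivots, tracking sign flips from row swaps. A rough numpy sketch of that idea (illustrative, not the thread's code):

```python
import numpy as np

def det_by_elimination(A):
    """Determinant in O(n^3) via Gaussian elimination with partial pivoting."""
    A = A.astype(float).copy()
    n = A.shape[0]
    det = 1.0
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry into the pivot.
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:
            A[[k, p]] = A[[p, k]]
            det = -det  # each row swap flips the sign
        if A[k, k] == 0.0:
            return 0.0  # singular matrix
        det *= A[k, k]
        # Eliminate the entries below the pivot.
        A[k+1:, k:] -= np.outer(A[k+1:, k] / A[k, k], A[k, k:])
    return det

A = np.random.rand(15, 15)  # n = 15 is trivial at O(n^3)
assert np.isclose(det_by_elimination(A), np.linalg.det(A))
```

So the determinant is cheap to *compute*; the answer's point is about the number of terms in the expanded formula, which is a different quantity.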


Indeed, it is a well-defined formula with only +1 and -1 as weights, but it involves multiplying many of the inputs together. Not sure this is a good example to try with a simple neural network. –


@eleanora You are confusing the number of terms with the computational complexity. – denfromufa