2016-04-29

I compared a Keras neural network against a simple logistic regression from scikit-learn on the iris data. As suggested by this post, I expected the Keras NN to perform better: How can a Keras neural network outperform logistic regression on the iris data?

But why, after reproducing the code from there, is the Keras NN's result lower than that of logistic regression?

import seaborn as sns 
import numpy as np 
from sklearn.cross_validation import train_test_split  # in scikit-learn >= 0.18 this lives in sklearn.model_selection
from sklearn.linear_model import LogisticRegressionCV 
from keras.models import Sequential 
from keras.layers.core import Dense, Activation 
from keras.utils import np_utils 

# Prepare data 
iris = sns.load_dataset("iris") 
X = iris.values[:, 0:4] 
y = iris.values[:, 4] 

# Make test and train set 
train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0) 

################################ 
# Evaluate Logistic Regression 
################################ 
lr = LogisticRegressionCV() 
lr.fit(train_X, train_y) 
pred_y = lr.predict(test_X) 
print("Test fraction correct (LR-Accuracy) = {:.2f}".format(lr.score(test_X, test_y))) 



################################ 
# Evaluate Keras Neural Network 
################################ 

# Make ONE-HOT 
def one_hot_encode_object_array(arr): 
    '''One hot encode a numpy array of objects (e.g. strings)''' 
    uniques, ids = np.unique(arr, return_inverse=True) 
    return np_utils.to_categorical(ids, len(uniques)) 


train_y_ohe = one_hot_encode_object_array(train_y) 
test_y_ohe = one_hot_encode_object_array(test_y) 

model = Sequential() 
model.add(Dense(16, input_shape=(4,))) 
model.add(Activation('sigmoid')) 
model.add(Dense(3)) 
model.add(Activation('softmax')) 
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam') 

# Actual modelling 
model.fit(train_X, train_y_ohe, verbose=0, batch_size=1) 
score, accuracy = model.evaluate(test_X, test_y_ohe, batch_size=16, verbose=0) 
print("Test fraction correct (NN-Score) = {:.2f}".format(score)) 
print("Test fraction correct (NN-Accuracy) = {:.2f}".format(accuracy)) 
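The one_hot_encode_object_array helper above is easy to sanity-check without Keras; a minimal stdlib-only sketch of the same idea (sorted unique labels become the columns, just as np.unique also returns them sorted):

```python
def one_hot(labels):
    """One-hot encode a list of labels; columns follow sorted label order."""
    uniques = sorted(set(labels))
    index = {label: i for i, label in enumerate(uniques)}
    return [[1.0 if index[label] == j else 0.0 for j in range(len(uniques))]
            for label in labels]

encoded = one_hot(["setosa", "versicolor", "virginica", "setosa"])
# Each row has exactly one 1.0; rows 0 and 3 are identical ("setosa")
```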

I am using this version of Keras:

In [2]: keras.__version__ 
Out[2]: '1.0.1' 

The results show:

Test fraction correct (LR-Accuracy) = 0.83 
Test fraction correct (NN-Score) = 0.75 
Test fraction correct (NN-Accuracy) = 0.60 

According to that post, the Keras accuracy should be 0.99. What went wrong?

Answers

Answer 1

In Keras version 1, released just this month (April 2016), the default number of epochs was reduced from 100 (the default in Keras version 0) to 10. Try:

model.fit(train_X, train_y_ohe, verbose=0, batch_size=1, nb_epoch=100) 
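To see why the epoch count matters so much, here is a toy gradient-descent sketch (plain Python, hypothetical data, not from the answer): the cross-entropy loss of a 1-D logistic model keeps falling well past 10 epochs.

```python
import math

# Toy 1-D, linearly separable binary data (hypothetical)
xs = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
ys = [0, 0, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_and_loss(epochs, lr=0.1):
    """SGD on logistic regression; returns mean cross-entropy after training."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x  # gradient of cross-entropy w.r.t. w
            b -= lr * (p - y)      # gradient of cross-entropy w.r.t. b
    loss = sum(-(y * math.log(sigmoid(w * x + b)) +
                 (1 - y) * math.log(1 - sigmoid(w * x + b)))
               for x, y in zip(xs, ys))
    return loss / len(xs)

# With only 10 epochs the model is still underfit; 100 epochs trains much further
assert train_and_loss(100) < train_and_loss(10)
```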
Answer 2

Your neural network is quite simple. Try building a deeper network by adding more neurons and layers. Scaling the features is also important. Try the glorot_uniform initializer. Last but not least, increase the number of epochs and check whether the loss decreases with each epoch.
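On the feature-scaling point: a minimal standardization sketch in plain Python (in practice you would likely use sklearn.preprocessing.StandardScaler, fitted on the training set only; the values below are made up):

```python
import math

def fit_standardizer(column):
    """Mean and standard deviation computed on the training column only."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return mean, std

def standardize(column, mean, std):
    return [(v - mean) / std for v in column]

train_col = [4.3, 5.0, 5.8, 6.4, 7.9]             # e.g. sepal lengths
mean, std = fit_standardizer(train_col)
train_scaled = standardize(train_col, mean, std)
test_scaled = standardize([5.1, 6.2], mean, std)  # reuse train statistics
# train_scaled now has mean 0 and unit variance
```

Fitting the scaler on the training set and reusing its statistics on the test set avoids leaking test information into training.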

So here you go:

from keras.layers.advanced_activations import PReLU
from keras.layers.normalization import BatchNormalization
from keras.layers.core import Dropout

model = Sequential()
model.add(Dense(input_dim=4, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))

# Four more identical hidden blocks
for _ in range(4):
    model.add(Dense(input_dim=512, output_dim=512, init='glorot_uniform'))
    model.add(PReLU())
    model.add(BatchNormalization())
    model.add(Dropout(0.5))

model.add(Dense(input_dim=512, output_dim=3, init='glorot_uniform'))
model.add(Activation('softmax'))

This reaches around 0.97 by about the 120th epoch.
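Both models end in a softmax layer, which turns the three class scores into probabilities; as a plain-Python sketch:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)                              # subtract max to avoid overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# Probabilities sum to 1; the largest score gets the largest probability
```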