2016-12-24 155 views

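CIFAR10 example: from Keras to Tensorflow

I have a working Keras CIFAR10 example that I am trying to convert to TF. I am fairly new to both Python and TF. I adapted a lot of material from here, but I kept the Keras functions for loading and preparing the dataset. This also ensures that both versions use the same data.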

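The problem may lie in the way I prepare the batches. The commented-out code in the TF version makes no difference, except that it is slow. That part of the code is supposed to copy batch_size images and labels from the dataset into epoch_x and epoch_y.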
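For reference, the per-element copy described above can also be written as a plain array slice. This is a minimal sketch, not the code from the question: the arrays are random stand-ins with the same trailing shapes, and `get_batch` is a hypothetical helper name.

```python
import numpy as np

batch_size = 50
# Hypothetical stand-ins for the real arrays, with the same trailing shapes.
X_train = np.random.rand(200, 32, 32, 3).astype('float32')
Y_train = np.eye(10, dtype='float32')[np.random.randint(0, 10, size=200)]

def get_batch(i):
    """Return the i-th batch of images and labels as array slices."""
    start = i * batch_size
    return X_train[start:start + batch_size], Y_train[start:start + batch_size]

epoch_x, epoch_y = get_batch(1)
print(epoch_x.shape, epoch_y.shape)
```

The slice selects the same rows as the inner `for j` loop, so it should not change what the network sees; it only avoids the Python-level loop.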

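In any case, the problem is that the accuracy of the TF version is stuck at 0.1 (chance level for 10 classes), even though the loss decreases over time. From previous experience, this is sometimes caused by a dataset problem.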

The code for both versions is below. If you can spot any problem with the TF version, please let me know. Many thanks in advance.

Keras:

from __future__ import print_function 
from keras.datasets import cifar10 
from keras.preprocessing.image import ImageDataGenerator 
from keras.models import Sequential 
from keras.layers import Dense, Dropout, Activation, Flatten 
from keras.layers import Convolution2D, MaxPooling2D 
from keras.optimizers import SGD, Adam 
from keras.utils import np_utils 
import numpy as np 

#seed = 7 
#np.random.seed(seed) 

batch_size = 50 
nb_classes = 10 
nb_epoch = 200 
data_augmentation = False 

# input image dimensions 
img_rows, img_cols = 32, 32 
# the CIFAR10 images are RGB 
img_channels = 3 

# the data, shuffled and split between train and test sets 
(X_train, y_train), (X_test, y_test) = cifar10.load_data() 
print('X_train shape:', X_train.shape) 
print(X_train.shape[0], 'train samples') 
print(X_test.shape[0], 'test samples') 

# convert class vectors to binary class matrices 
Y_train = np_utils.to_categorical(y_train, nb_classes) 
Y_test = np_utils.to_categorical(y_test, nb_classes) 

model = Sequential() 

model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=X_train.shape[1:]))
model.add(Activation('relu')) 
model.add(Convolution2D(32, 3, 3, border_mode='same')) 
model.add(Activation('relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 
model.add(Dropout(0.25)) 

model.add(Convolution2D(64, 3, 3, border_mode='same')) 
model.add(Activation('relu')) 
model.add(Convolution2D(64, 3, 3, border_mode='same')) 
model.add(Activation('relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 
model.add(Dropout(0.25)) 

model.add(Flatten()) 
model.add(Dense(512)) 
model.add(Activation('relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(nb_classes)) 
model.add(Activation('softmax')) 

# train the model with Adam; the original SGD + momentum line is kept
# commented out for reference

#sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
adam = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy',
              optimizer=adam,
              metrics=['accuracy'])

X_train = X_train.astype('float32') 
X_test = X_test.astype('float32') 
X_train /= 255 
X_test /= 255 

if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test),
              shuffle=True)

else:
    print('Using real-time data augmentation.')

    # this will do preprocessing and realtime data augmentation
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically

    # compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied)
    datagen.fit(X_train)

    # fit the model on the batches generated by datagen.flow()
    model.fit_generator(datagen.flow(X_train, Y_train,
                                     batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0],
                        nb_epoch=nb_epoch,
                        validation_data=(X_test, Y_test))

model.save('model3.h5') 

Tensorflow:

from __future__ import print_function 
from keras.datasets import cifar10 
from keras.utils import np_utils 
import tensorflow as tf 
import numpy as np 

# input image dimensions 
img_rows, img_cols = 32, 32 
# the CIFAR10 images are RGB 
img_channels = 3 
batch_size = 50 
nb_classes = 10 
nb_epoch = 200 

#seed = 7 
#np.random.seed(seed) 

epoch_x=np.zeros((batch_size,img_rows, img_cols,img_channels)).astype('float32') 
print('epoch_x shape:', epoch_x.shape) 
epoch_y=np.zeros((batch_size,nb_classes)).astype('float32') 
print('epoch_y shape:', epoch_y.shape) 

num_train_examples=50000 
(X_train, y_train), (X_test, y_test) = cifar10.load_data() 
print('X_train shape:', X_train.shape) 
print('X_train shape:', X_train.shape[1:]) 
print(X_train.shape[0], 'train samples') 
print(X_test.shape[0], 'test samples') 
X_train = X_train.astype('float32') 
X_test = X_test.astype('float32') 
X_train /= 255 
X_test /= 255 

# convert class vectors to binary class matrices 
Y_train = np_utils.to_categorical(y_train, nb_classes) 
print('Y_train shape:', Y_train.shape) 
Y_test = np_utils.to_categorical(y_test, nb_classes) 

def conv2d(x, W): 
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME') 

def maxpool2d(x): 
    #      size of window   movement of window 
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') 

# Define network 
# TF graph 
img = tf.placeholder(tf.float32, shape=(None,img_rows, img_cols,img_channels)) 
labels = tf.placeholder(tf.float32, shape=(None, nb_classes)) 

weights = {'W_conv0': tf.Variable(tf.random_normal([3,3,3,32])),
           'W_conv1': tf.Variable(tf.random_normal([3,3,32,32])),
           'W_conv2': tf.Variable(tf.random_normal([3,3,32,64])),
           'W_conv3': tf.Variable(tf.random_normal([3,3,64,64])),
           'W_fc': tf.Variable(tf.random_normal([8*8*64,512])),
           'out': tf.Variable(tf.random_normal([512, nb_classes]))}

biases = {'b_conv0': tf.Variable(tf.random_normal([32])),
          'b_conv1': tf.Variable(tf.random_normal([32])),
          'b_conv2': tf.Variable(tf.random_normal([64])),
          'b_conv3': tf.Variable(tf.random_normal([64])),
          'b_fc': tf.Variable(tf.random_normal([512])),
          'out': tf.Variable(tf.random_normal([nb_classes]))}

conv0 = conv2d(img, weights['W_conv0']) + biases['b_conv0'] 
conv0 = tf.nn.relu(conv0) 

conv1 = conv2d(conv0, weights['W_conv1']) + biases['b_conv1'] 
conv1 = tf.nn.relu(conv1) 
conv1 = maxpool2d(conv1) 
conv1 = tf.nn.dropout(conv1,0.25) 

conv2 = conv2d(conv1, weights['W_conv2']) + biases['b_conv2'] 
conv2 = tf.nn.relu(conv2) 

conv3 = conv2d(conv2, weights['W_conv3']) + biases['b_conv3'] 
conv3 = tf.nn.relu(conv3) 
conv3 = maxpool2d(conv3) 
conv3 = tf.nn.dropout(conv3,0.25) 

fc = tf.reshape(conv3,[-1, 8*8*64]) 
fc = tf.matmul(fc, weights['W_fc'])+biases['b_fc'] 
fc = tf.nn.relu(fc) 
fc = tf.nn.dropout(fc,0.5) 

prediction = tf.matmul(fc, weights['out'])+biases['out'] 

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(prediction,labels)) 
optimizer = tf.train.AdamOptimizer().minimize(cost) 
#optimizer = tf.train.AdamOptimizer(learning_rate=1e-3,epsilon=0.1).minimize(cost) 
#optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001).minimize(cost) 

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for epoch in range(nb_epoch):
        epoch_loss = 0
        for i in range(int(num_train_examples/batch_size)):
            # batch = mnist_data.train.next_batch(batch_size)
            for j in range(batch_size):
                epoch_x[j] = X_train[i*batch_size+j]
                epoch_y[j] = Y_train[i*batch_size+j]
##            for j in range(batch_size):
##                for row in range(img_rows):
##                    for col in range(img_cols):
##                        for ch in range(img_channels):
##                            epoch_x[j][row][col][ch]=X_train[i*batch_size+j][row][col][ch]
##            for j in range(batch_size):
##                for t in range(nb_classes):
##                    epoch_y[j][t]=Y_train[i*batch_size+j][t]
            _, c = sess.run([optimizer, cost], feed_dict={img: epoch_x, labels: epoch_y})
            epoch_loss += c
        print('Epoch', epoch, 'completed out of', nb_epoch, 'loss:', epoch_loss)
        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(labels, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float32'))
        print('Accuracy:', accuracy.eval({img: X_test, labels: Y_test}))

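A quick look at the CIFAR10 dataset shows that it is already well shuffled, so I am inclined to rule that out as a cause. In any case, the Keras example converges quickly even with shuffle=False. – ozne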


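I noticed that the network output 'prediction' does not pass through a nonlinearity. This is explained [here](http://stackoverflow.com/questions/34240703/difference-between-tensorflow-tf-nn-softmax-and-tf-nn-softmax-cross-entropy-with): the softmax is included in the cost. However, the accuracy check 'correct = ...' compares the labels against the **linear** output of the network. That does not look right. Or is it? In any case, when I add 'soft = tf.nn.softmax(prediction)' and change the check to 'correct = tf.equal(tf.argmax(soft, 1), tf.argmax(labels, 1))', nothing changes: accuracy is still stuck at 0.1. – ozne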
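The observation that adding the softmax changes nothing is expected: softmax is strictly increasing within each row, so the position of the largest entry is the same before and after it. A minimal NumPy sketch of that property, using random logits as a stand-in for the network output (the `softmax` helper is written here for illustration and is not TF's implementation):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))  # stand-in for 'prediction'

# Compare the predicted class index with and without the softmax.
same = np.array_equal(np.argmax(logits, axis=1),
                      np.argmax(softmax(logits), axis=1))
print(same)
```

So comparing the labels against the linear output is a valid accuracy check, and the 0.1 accuracy has to come from somewhere else.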


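In desperation, I wrote a simple hash function and hashed every image in the dataset X_train as well as every image in epoch_x that is fed to the optimizer, building two text files containing the image index, hash, and label. The two files match exactly, which seems to show that the dataset is not the problem. Sigh..... – ozne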

Answer


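After days of miserable toil, I found the reason for this incredible discrepancy: weight initialization! Apparently, Keras by default uses a uniform distribution between 0 and 0.05. After changing the TF code to do the same, it behaves much better: the accuracy rises nicely and is no longer stuck at 0.1.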
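To give a feel for why the initialization scale matters, here is a rough NumPy sketch under stated assumptions. It pushes random stand-in activations through one dense layer shaped like W_fc in the question, once with weights drawn like `tf.random_normal` (which defaults to mean 0, stddev 1) and once with small uniform weights in the 0 to 0.05 range quoted above. The inputs are random numbers rather than real CIFAR10 features, so only the relative scale is meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 8 * 8 * 64, 512  # shape of W_fc in the question

# Stand-in activations in [0, 1); not real network features.
x = rng.random((100, fan_in)).astype(np.float32)

# tf.random_normal defaults: mean 0, stddev 1.
w_normal = rng.normal(0.0, 1.0, size=(fan_in, fan_out)).astype(np.float32)
# Small uniform weights; the 0..0.05 range is taken from the answer.
w_uniform = rng.uniform(0.0, 0.05, size=(fan_in, fan_out)).astype(np.float32)

# Spread of the pre-activations under each initialization.
spread_normal = float((x @ w_normal).std())
spread_uniform = float((x @ w_uniform).std())
print(spread_normal, spread_uniform)
```

With unit-variance weights and a fan-in of 4096, the pre-activations are spread far more widely than with the small uniform weights, and that spread compounds through the layers. Logits that large push the softmax toward near one-hot outputs, which is a plausible reason for the poor start.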


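By the way, even though it is much better than before after changing the init, the Keras script (with the same TF backend) reaches a test accuracy of ~0.8, while the pure TF code reaches only ~0.54. Something is still missing here........ – ozne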
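One candidate for the remaining gap, offered as an assumption rather than a confirmed diagnosis: in TF 1.x the second argument of tf.nn.dropout is keep_prob, the probability of keeping a unit, while Keras Dropout(0.25) drops 25% of the units. As written, tf.nn.dropout(conv1, 0.25) therefore keeps only about a quarter of the activations. The TF graph in the question also applies dropout while evaluating test accuracy, and it uses the default AdamOptimizer learning rate (0.001) instead of the 0.0001 passed to Adam in the Keras script. The NumPy sketch below only illustrates the keep_prob semantics; the `dropout` helper is hypothetical and is not TF's implementation.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob, rescale survivors."""
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((1000, 100), dtype=np.float32)

# Fraction of units that survive when 0.25 is read as keep_prob (TF 1.x semantics).
kept_as_written = float((dropout(x, 0.25, rng) != 0).mean())
# Fraction that survives when 25% are dropped (Keras Dropout(0.25) semantics).
kept_keras_equivalent = float((dropout(x, 0.75, rng) != 0).mean())
print(kept_as_written, kept_keras_equivalent)
```

Under that reading, matching the Keras model would mean passing 0.75, 0.75 and 0.5 as keep_prob, and feeding keep_prob through a placeholder so that it can be set to 1.0 for evaluation.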
