2017-04-26 75 views
1

我寫一個程序tensorflow處理Kaggle的數位識別problem.Program可以正常運行,但訓練精度總是很低,約10%,如下列:TensorFlow - 訓練精度在MNIST數據沒有改善

step 0, training accuracy 0.11 
step 100, training accuracy 0.13 
step 200, training accuracy 0.21 
step 300, training accuracy 0.12 
step 400, training accuracy 0.07 
step 500, training accuracy 0.08 
step 600, training accuracy 0.15 
step 700, training accuracy 0.05 
step 800, training accuracy 0.08 
step 900, training accuracy 0.12 
step 1000, training accuracy 0.05 
step 1100, training accuracy 0.09 
step 1200, training accuracy 0.12 
step 1300, training accuracy 0.1 
step 1400, training accuracy 0.08 
step 1500, training accuracy 0.11 
step 1600, training accuracy 0.17 
step 1700, training accuracy 0.13 
step 1800, training accuracy 0.11 
step 1900, training accuracy 0.13 
step 2000, training accuracy 0.07 
…… 

以下是我的代碼:

def weight_variable(shape): 
    initial = tf.truncated_normal(shape, stddev=0.1) 
    return tf.Variable(initial) 

def bias_variable(shape): 
    initial = tf.constant(0.1, shape=shape) 
    return tf.Variable(initial) 

def conv2d(x, w): 
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME') 

def max_pool_2x2(x): 
    # ksize = [batch, heigh, width, channels], strides=[batch, stride, stride, channels] 
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') 

x = tf.placeholder(tf.float32, [None, 784])  
y_ = tf.placeholder(tf.float32, [None, 10])  
keep_prob = tf.placeholder(tf.float32) 

x_image = tf.placeholder(tf.float32, [None, 28, 28, 1]) 

w_conv1 = weight_variable([5, 5, 1, 32])  
b_conv1 = bias_variable([32]) 
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1) 
h_pool1 = max_pool_2x2(h_conv1) 

w_conv2 = weight_variable([5, 5, 32, 64])  
b_conv2 = bias_variable([64]) 
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2) 
h_pool2 = max_pool_2x2(h_conv2) 

w_fc1 = weight_variable([7 * 7 * 64, 1024]) 
b_fc1 = bias_variable([1024]) 
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1) 
# dropout 
keep_prob = tf.placeholder(tf.float32)  
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) 
# softmax 
w_fc2 = weight_variable([1024, 10]) 
b_fc2 = bias_variable([10]) 
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2) 

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1])) 
train_step = tf.train.AdamOptimizer(10e-4).minimize(cross_entropy) 

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 

def get_batch(i, size, train, label): 
    startIndex = (i * size) % 42000 
    endIndex = startIndex + size 
    batch_X = train[startIndex : endIndex] 
    batch_Y = label[startIndex : endIndex] 
    return batch_X, batch_Y 


data = pd.read_csv('train.csv') 
train_data = data.drop(['label'], axis=1) 
train_data = train_data.values.astype(dtype=np.float32) 
train_data = train_data.reshape(42000, 28, 28, 1) 

label_data = data['label'].tolist() 
label_data = tf.one_hot(label_data, depth=10) 
label_data = tf.Session().run(label_data).astype(dtype=np.float64) 


batch_size = 100        
tf.global_variables_initializer().run() 

for i in range(20000): 
    batch_x, batch_y = get_batch(i, batch_size, train_data, label_data) 
    if i % 100 == 0: 
     train_accuracy = accuracy.eval(feed_dict={x_image: batch_x, y_: batch_y, keep_prob: 1.0}) 
     print("step %d, training accuracy %g" % (i, train_accuracy)) 
    train_step.run(feed_dict={x_image: batch_x, y_: batch_y, keep_prob: 0.9}) 

我不知道什麼地方錯了我的計劃。

+0

如果答案解決了您的問題,請接受它 - 請參閱[我應該怎麼做當有人回答我的問題?](謝謝 – desertnaut

回答

0

我建議你改變你的bias_variable功能 - 不知道如何tf.Variable(tf.constant)的行爲,再加上我們平時在零初始化的偏見,而不是0.1:

def bias_variable(shape): 
    return tf.zeros((shape), dtype = tf.float32) 

如果這沒有幫助,請您初始化重量與stddev=0.01

+0

謝謝你的非常重要!我將權重和偏差的stddev改爲0.01,程序在約3000步之前正常運行,精度約爲0.99。但是,在步驟3000之後,精確度突然變爲0.1左右。非常奇怪。 – Lixudong

+0

@Lixudong這是另一個不同的問題 - 請接受這個答案,並打開一個新帖子 – desertnaut

+0

...與您更新的代碼和結果 – desertnaut