2-layer NN weights not updating

I have a fairly simple neural network with one hidden layer.

However, the weights don't seem to be updating. Or maybe it's just that the variable values never change?

Either way, my accuracy is 0.1, and it doesn't change no matter how I alter the learning rate or the activation function. I have no idea what's wrong. Any ideas?

I've posted properly formatted code, so you can copy-paste it and run it on your local machine.

from tensorflow.examples.tutorials.mnist import input_data 
import math 
import numpy as np 
import tensorflow as tf 

# one hot option returns binarized labels.
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)
# model parameters 
x = tf.placeholder(tf.float32, [784, None],name='x') 
# weights 
W1 = tf.Variable(tf.truncated_normal([25, 784],stddev= 1.0/math.sqrt(784)),name='W') 
W2 = tf.Variable(tf.truncated_normal([25, 25],stddev=1.0/math.sqrt(25)),name='W') 
W3 = tf.Variable(tf.truncated_normal([10, 25],stddev=1.0/math.sqrt(25)),name='W') 

# bias units
b1 = tf.Variable(tf.zeros([25,1]),name='b1')
b2 = tf.Variable(tf.zeros([25,1]),name='b2') 
b3 = tf.Variable(tf.zeros([10,1]),name='b3') 

# NN architecture 
hidden1 = tf.nn.relu(tf.matmul(W1, x,name='hidden1')+b1, name='hidden1_out') 

# hidden2 = tf.nn.sigmoid(tf.matmul(W2, hidden1, name='hidden2')+b2, name='hidden2_out') 

y = tf.matmul(W3, hidden1,name='y') + b3 

y_ = tf.placeholder(tf.float32, [10, None],name='y_') 

# Create the model 
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_)) 
train_step = tf.train.GradientDescentOptimizer(2).minimize(cross_entropy) 

sess = tf.Session() 
summary_writer = tf.train.SummaryWriter('log_simple_graph', sess.graph) 
init = tf.global_variables_initializer() 
sess.run(init) 
# Train 
for i in range(1000): 
    batch_xs, batch_ys = mnist.train.next_batch(100) 
    summary = sess.run(train_step, feed_dict={x: np.transpose(batch_xs), y_: np.transpose(batch_ys)})
    if summary is not None: 
     summary_writer.add_event(summary) 

# Test trained model 
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 

print(sess.run(accuracy, feed_dict={x: np.transpose(mnist.test.images), y_: np.transpose(mnist.test.labels)})) 

Could you reformat the code so that it's clear which lines are comments? I also notice that the learning rate for 'tf.train.GradientDescentOptimizer' is 2, which is very large. Does lowering it to 0.1 or 0.01 improve the accuracy? – mrry


Sorry, I was fixing it just as you noticed. It should be fine now. – marc


No. As I already mentioned, changing the learning rate or the activation function doesn't do anything. Accuracy is 0 with a learning rate of 0.01. – marc

Answer


The reason you always get 0.1 accuracy is mainly the ordering of the dimensions of your input placeholder, and of the weights that follow from it. The learning rate is another factor: if it is very high, the gradients oscillate and never settle into a minimum.
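As a quick check on the literal question of whether the weights move at all, here is a minimal diagnostic sketch (not part of the original post; it reuses the names from the code above). It snapshots a weight matrix around a single training step:

# diagnostic sketch: does W1 actually change after one training step?
w_before = sess.run(W1)
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})  # use np.transpose(...) if your placeholders are feature-first
w_after = sess.run(W1)
print(np.abs(w_after - w_before).max())  # a value near 0 means W1 is not updating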

TensorFlow takes the number of instances (the batch size) as the first dimension of a placeholder. So the input x, declared here as

x = tf.placeholder(tf.float32, [784, None],name='x') 

should instead be declared as

x = tf.placeholder(tf.float32, [None, 784],name='x') 

Accordingly, W1 should be declared as

W1 = tf.Variable(tf.truncated_normal([784, 25],stddev= 1.0/math.sqrt(784)),name='W') 

and so on. Even the bias variables should be declared in this transposed sense, as rank-1 vectors. (That's how TensorFlow expects them. :))

For example:

b1 = tf.Variable(tf.zeros([25]),name='b1') 
b2 = tf.Variable(tf.zeros([25]),name='b2') 
b3 = tf.Variable(tf.zeros([10]),name='b3') 
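The reason rank-1 biases work is broadcasting: adding a [units] vector to a [batch, units] matrix adds the bias to every row. A tiny NumPy illustration (not from the original answer):

import numpy as np
acts = np.zeros((100, 25), dtype=np.float32)  # stand-in for hidden activations of a batch of 100
bias = np.ones(25, dtype=np.float32)          # rank-1 bias, like b1 above
print((acts + bias).shape)                    # (100, 25): the bias is broadcast across the batch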

I've put the corrected full code below for your reference. I reached an accuracy of 0.9262 with it. :D

from tensorflow.examples.tutorials.mnist import input_data 
import math 
import numpy as np 
import tensorflow as tf 

# one hot option returns binarized labels. 
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True) 
# model parameters 
x = tf.placeholder(tf.float32, [None, 784],name='x') 
# weights 
W1 = tf.Variable(tf.truncated_normal([784, 25],stddev= 1.0/math.sqrt(784)),name='W') 
W2 = tf.Variable(tf.truncated_normal([25, 25],stddev=1.0/math.sqrt(25)),name='W') 
W3 = tf.Variable(tf.truncated_normal([25, 10],stddev=1.0/math.sqrt(25)),name='W') 

# bias units 
b1 = tf.Variable(tf.zeros([25]),name='b1') 
b2 = tf.Variable(tf.zeros([25]),name='b2') 
b3 = tf.Variable(tf.zeros([10]),name='b3') 

# NN architecture 
hidden1 = tf.nn.relu(tf.matmul(x, W1,name='hidden1')+b1, name='hidden1_out') 

# hidden2 = tf.nn.sigmoid(tf.matmul(W2, hidden1, name='hidden2')+b2, name='hidden2_out') 

y = tf.matmul(hidden1, W3,name='y') + b3 

y_ = tf.placeholder(tf.float32, [None, 10],name='y_') 

# Create the model 
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_)) 
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy) 

sess = tf.Session() 
summary_writer = tf.train.SummaryWriter('log_simple_graph', sess.graph) 
init = tf.initialize_all_variables() 
sess.run(init) 

for i in range(1000): 
    batch_xs, batch_ys = mnist.train.next_batch(100) 
    summary = sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    if summary is not None: 
     summary_writer.add_event(summary) 

# Test trained model 
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})) 
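If you want to see the effect of the learning rate directly, here is a small sketch (not part of the original answer) that fetches the loss alongside the training op; with a learning rate of 2 the loss tends to oscillate, while with 0.1 it should decrease steadily:

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    _, loss_val = sess.run([train_step, cross_entropy], feed_dict={x: batch_xs, y_: batch_ys})
    if i % 100 == 0:
        print('step %d, cross entropy %.4f' % (i, loss_val))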

Great! Where does it say that in the TF docs? – marc


[link] https://www.tensorflow.org/versions/r0.10/tutorials/mnist/beginners/index.html –


You can go through this link –