2017-02-28

Why does my TensorFlow neural network for XOR only reach about 0.5 accuracy?

I wrote a neural network in TensorFlow for XOR inputs, with one hidden layer of 2 units and a softmax classifier. Each input row has the form <1, x_1, x_2, zero, one>, where:

  • 1 is the bias
  • x_1 and x_2 are 0 or 1 in all combinations {00, 01, 10, 11}, sampled from a normal distribution centred on 0 or 1
  • zero: is 1 if the output is zero
  • one: is 1 if the output is one

The accuracy is always around 0.5. What is going wrong? Is the architecture of the neural network wrong, or is there something in the code?
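For reference, the row encoding described above can be sketched in plain NumPy, independently of the TensorFlow code below (the names `make_xor_rows`, `n_per_case`, and `noise` are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_xor_rows(n_per_case=100, noise=0.01):
    """Build rows of the form <1, x1, x2, zero, one>: a bias column of
    ones, two noisy inputs drawn from a normal distribution around 0 or
    1, and a one-hot label (zero, one) encoding XOR of the inputs."""
    rows = []
    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        label = a ^ b  # XOR target for this input combination
        bias = np.ones((n_per_case, 1))
        x1 = rng.normal(a, noise, size=(n_per_case, 1))
        x2 = rng.normal(b, noise, size=(n_per_case, 1))
        zero = np.full((n_per_case, 1), 1.0 - label)  # 1 if output is 0
        one = np.full((n_per_case, 1), float(label))  # 1 if output is 1
        rows.append(np.hstack([bias, x1, x2, zero, one]))
    data = np.vstack(rows)
    rng.shuffle(data)  # shuffle rows in place
    return data

data = make_xor_rows()
print(data.shape)  # (400, 5)
```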

import tensorflow as tf 
import numpy as np 
from random import randint 

DEBUG=True 

def init_weights(shape): 
    return tf.Variable(tf.random_normal(shape, stddev=0.01)) 


def model(X, weight_hidden, weight_output): 
    # [1,3] x [3,n_hiddent_units] = [1,n_hiddent_units] 
    hiddern_units_output = tf.nn.sigmoid(tf.matmul(X, weight_hidden)) 

    # [1,n_hiddent_units] x [n_hiddent_units, 2] = [1,2] 
    return hiddern_units_output 
    #return tf.matmul(hiddern_units_output, weight_output) 


def getHiddenLayerOutput(X, weight_hidden): 
    hiddern_units_output = tf.nn.sigmoid(tf.matmul(X, weight_hidden)) 
    return hiddern_units_output 

total_inputs = 100 
zeros = tf.zeros([total_inputs,1]) 
ones = tf.ones([total_inputs,1]) 
around_zeros = tf.random_normal([total_inputs,1], mean=0, stddev=0.01) 
around_ones = tf.random_normal([total_inputs,1], mean=1, stddev=0.01) 

batch_size = 10 
n_hiddent_units = 2 
X = tf.placeholder("float", [None, 3]) 
Y = tf.placeholder("float", [None, 2]) 

weight_hidden = init_weights([3, n_hiddent_units]) 
weight_output = init_weights([n_hiddent_units, 2]) 

hiddern_units_output = getHiddenLayerOutput(X, weight_hidden) 
py_x = model(X, weight_hidden, weight_output) 

#cost = tf.square(Y - py_x) 
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y)) 
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost) 

with tf.Session() as sess: 
    tf.global_variables_initializer().run() 

    trX_0_0 = sess.run(tf.concat([ones, around_zeros, around_zeros, ones, zeros], axis=1)) 
    trX_0_1 = sess.run(tf.concat([ones, around_zeros, around_ones, zeros, ones], axis=1)) 
    trX_1_0 = sess.run(tf.concat([ones, around_ones, around_zeros, zeros, ones], axis=1)) 
    trX_1_1 = sess.run(tf.concat([ones, around_ones, around_ones, ones, zeros], axis=1)) 
    trX = sess.run(tf.concat([trX_0_0, trX_0_1, trX_1_0, trX_1_1], axis=0)) 
    trX = sess.run(tf.random_shuffle(trX)) 
    print(trX) 

    for i in range(10): 
     for start, end in zip(range(0, len(trX), batch_size), range(batch_size, len(trX) + 1, batch_size)): 
      trY = tf.identity(trX[start:end,3:5]) 
      trY = sess.run(tf.reshape(trY,[batch_size, 2])) 
      sess.run(train_op, feed_dict={ X: trX[start:end,0:3], Y: trY }) 

     start_index = randint(0, (total_inputs*4)-batch_size) 
     y_0 = sess.run(py_x, feed_dict={X: trX[start_index:start_index+batch_size,0:3]}) 
     print("iteration :",i, " accuracy :", np.mean(np.absolute(trX[start_index:start_index+batch_size,3:5]-y_0)),"\n") 

Check the updated code.

You should know by now that you can't link to code external to this site – you have to put it in the question itself. –

Thanks for the edit @RandomDavis, it is much more readable now ;) –

Add a bias to your network – lejlot

Answer

0

The problem was with the commented-out section and the randomly assigned weights. Here is the modified version, obtained after a series of trial and error.
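The linked fix is not reproduced here, but the commented-out section referred to is visible in `model()` above: it returns the hidden-layer activations directly and never applies `weight_output`, so the values fed to the softmax cross-entropy are not real logits. A framework-agnostic NumPy sketch of the corrected architecture follows (the hidden layer is widened to 8 units for reliable convergence; `n_hidden`, `lr`, and the step count are illustrative choices, not from the linked answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# The four XOR cases, with a leading bias column as in the question.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
Y = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)  # (zero, one)

n_hidden = 8  # wider than the question's 2 units, for reliable convergence
W_h = rng.normal(0, 0.5, size=(3, n_hidden))
W_o = rng.normal(0, 0.5, size=(n_hidden, 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

lr = 1.0
for step in range(10000):
    # Forward pass: unlike the question's model(), the hidden
    # activations are passed through the output layer to get logits.
    h = sigmoid(X @ W_h)
    logits = h @ W_o
    p = softmax(logits)
    # Backward pass: the softmax cross-entropy gradient is p - Y.
    d_logits = (p - Y) / len(X)
    d_Wo = h.T @ d_logits
    d_h = d_logits @ W_o.T * h * (1 - h)
    d_Wh = X.T @ d_h
    W_o -= lr * d_Wo
    W_h -= lr * d_Wh

pred = softmax(sigmoid(X @ W_h) @ W_o).argmax(axis=1)
print(pred)  # expected to recover the XOR targets 0, 1, 1, 0
```

With the output layer in place, the network has the capacity to separate the XOR classes, which the hidden-layer activations alone cannot do through a softmax.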
