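TensorFlow: neural network accuracy is always 100% on train and test sets
I am building a TensorFlow neural network with 2 hidden layers of 10 units each, using ReLU activations and Xavier initialization for the weights. The output layer has 1 unit with a sigmoid activation and performs binary classification (0 or 1): it predicts whether a Titanic passenger survived, based on the input features.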
(The only code omitted is the load_data function, which populates the variables X_train, Y_train, X_test, Y_test used later in the program.)
Parameters
# Hyperparams
learning_rate = 0.001
lay_dims = [10,10, 1]
# Other params
m = X_train.shape[1]
n_x = X_train.shape[0]
n_y = Y_train.shape[0]
Inputs
X = tf.placeholder(tf.float32, shape=[X_train.shape[0], None], name="X")
norm = tf.nn.l2_normalize(X, 0) # normalize inputs
Y = tf.placeholder(tf.float32, shape=[Y_train.shape[0], None], name="Y")
Initialize weights & biases
W1 = tf.get_variable("W1", [lay_dims[0],n_x], initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.get_variable("b1", [lay_dims[0],1], initializer=tf.zeros_initializer())
W2 = tf.get_variable("W2", [lay_dims[1],lay_dims[0]], initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.get_variable("b2", [lay_dims[1],1], initializer=tf.zeros_initializer())
W3 = tf.get_variable("W3", [lay_dims[2],lay_dims[1]], initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.get_variable("b3", [lay_dims[2],1], initializer=tf.zeros_initializer())
Forward prop
Z1 = tf.add(tf.matmul(W1,X), b1)
A1 = tf.nn.relu(Z1)
Z2 = tf.add(tf.matmul(W2,A1), b2)
A2 = tf.nn.relu(Z2)
Y_hat = tf.add(tf.matmul(W3,A2), b3)
BackProp
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=tf.transpose(Y_hat), labels=tf.transpose(Y)))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Session
# Initialize
init = tf.global_variables_initializer()
with tf.Session() as sess:
    # Initialize
    sess.run(init)
    # Normalize Inputs
    sess.run(norm, feed_dict={X:X_train, Y:Y_train})
    # Forward/Backprop and update weights
    for i in range(10000):
        c, _ = sess.run([cost, optimizer], feed_dict={X:X_train, Y:Y_train})
        if i % 100 == 0:
            print(c)
    correct_prediction = tf.equal(tf.argmax(Y_hat), tf.argmax(Y))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Training Set:", sess.run(accuracy, feed_dict={X: X_train, Y: Y_train}))
    print("Testing Set:", sess.run(accuracy, feed_dict={X: X_test, Y: Y_test}))
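One thing worth checking in the session code: `norm` is a separate tensor derived from `X`, while `Z1` is computed from `X` directly, so evaluating `norm` once returns normalized values without changing what the training steps receive. The NumPy sketch below illustrates that idea only. The helper `l2_normalize` is a hypothetical stand-in that mimics the documented formula of `tf.nn.l2_normalize`, and the small `X_train` array is made-up data, not the Titanic set.

```python
import numpy as np

def l2_normalize(x, axis=0, epsilon=1e-12):
    """Hypothetical NumPy stand-in for tf.nn.l2_normalize: x / sqrt(max(sum(x**2), epsilon))."""
    squared_sum = np.sum(np.square(x), axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(squared_sum, epsilon))

# Made-up data laid out like the question: rows are features, columns are examples.
X_train = np.array([[3.0, 0.0],
                    [4.0, 2.0]])

# Normalizing returns a NEW array; each column (example) now has unit L2 norm.
X_norm = l2_normalize(X_train, axis=0)

print(X_norm)    # normalized copy
print(X_train)   # original values are unchanged
```

If normalized inputs are wanted during training, the normalized values would have to be what the first layer consumes, rather than the raw `X`.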
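After training for 10,000 epochs, the cost goes down every time, which suggests that learning_rate is fine and the cost function looks normal. However, after training, all of my Y_hat values (the predictions on the training set) are 1 (predicting that the passenger survived). So basically, the prediction outputs y = 1 for every training example.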
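Because `Y_hat` is passed to `tf.nn.sigmoid_cross_entropy_with_logits` as `logits`, it holds raw scores rather than probabilities. A minimal NumPy sketch of how such scores could be turned into 0/1 predictions for inspection is shown here. The `logits` values are made up for illustration, and `sigmoid` is a small helper defined in the snippet, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up logits with the same (1, m) layout as Y_hat in the question.
logits = np.array([[-1.2, 0.4, 3.0, -0.1]])

probs = sigmoid(logits)                  # probabilities of class 1
labels = (probs > 0.5).astype(int)       # hard 0/1 predictions
positive_rate = labels.mean()            # fraction of examples predicted as 1

print(probs)
print(labels)
print(positive_rate)
```

Looking at `positive_rate` on the real predictions would show directly whether the network really outputs 1 for every example.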
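Also, when I run tf.argmax on Y_hat, the result is a matrix of all 0s. The same thing happens when tf.argmax is applied to Y (the ground-truth labels), which is odd because Y consists of all the correct labels for the training examples.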
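The all-zero argmax result can be reproduced outside TensorFlow. `tf.argmax` reduces over axis 0 by default, and both `Y_hat` and `Y` have shape (1, m), so every column contains a single entry whose index is necessarily 0. The sketch below uses NumPy with made-up values (not the real Titanic data) to show why the argmax comparison always agrees, and contrasts it with a threshold-based accuracy as one possible alternative.

```python
import numpy as np

# Made-up labels and logits with the question's layout: one row, m columns.
Y = np.array([[0.0, 1.0, 1.0, 0.0, 1.0]])
Y_hat = np.array([[2.3, 1.7, 3.1, 0.9, 2.8]])  # all positive, i.e. all "survived"

# np.argmax(..., axis=0) mirrors tf.argmax's default axis.
# With a single row, the maximum of every column sits at row index 0.
pred_idx = np.argmax(Y_hat, axis=0)
true_idx = np.argmax(Y, axis=0)
argmax_accuracy = np.mean(pred_idx == true_idx)

# Alternative for a single sigmoid output: threshold the logits at 0,
# which is equivalent to thresholding sigmoid(logit) at 0.5.
pred_labels = (Y_hat > 0).astype(np.float64)
threshold_accuracy = np.mean(pred_labels == Y)

print(pred_idx, true_idx)      # both are all zeros
print(argmax_accuracy)         # always 1.0 with this layout
print(threshold_accuracy)      # reflects the actual agreement with Y
```

This is consistent with the reported 100% accuracy on both sets: the comparison checks two vectors of zeros against each other, regardless of what the network predicts.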
Any help is greatly appreciated. Thanks.
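I don't understand: "It seems that all my data coming out of Y_hat is 1 or close to 1 the longer I train the model, and the argmax of both my Y_hat and Y (which holds ground-truth labels of 0 or 1) comes out as 0." That sentence is confusing. Can you rephrase it? – Lan
Just made an edit. Is that better? – IanTimmis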