Building an MLP with TensorFlow for binary classification

I'm having some trouble trying to set up a multilayer perceptron for binary classification using TensorFlow. I have a very large dataset (about 1.5 * 10^6 examples), each with a binary (0/1) label and 100 features. What I need to do is set up a simple MLP and then try varying the learning rate and the initialization pattern and document the results (it's an assignment). I'm getting strange results, though: my MLP seems to get stuck early at a low, but not great, cost and never goes down from there, and with fairly low learning rates the cost goes up almost immediately. I don't know whether the problem lies in how I structured the MLP (I made several attempts, I'm posting the last one) or whether I'm missing something in my TensorFlow implementation.
CODE
import tensorflow as tf
import numpy as np
import scipy.io
# Import and transform dataset
print("Importing dataset.")
dataset = scipy.io.mmread('tfidf_tsvd.mtx')
with open('labels.txt') as f:
    all_labels = f.readlines()
all_labels = np.asarray(all_labels)
all_labels = all_labels.reshape((1498271,1))
# Split dataset into training (66%) and test (33%) set
training_set = dataset[0:1000000]
training_labels = all_labels[0:1000000]
test_set = dataset[1000000:1498272]
test_labels = all_labels[1000000:1498272]
print("Dataset ready.")
# Parameters
learning_rate = 0.01 #argv
mini_batch_size = 100
training_epochs = 10000
display_step = 500
# Network Parameters
n_hidden_1 = 64 # 1st hidden layer of neurons
n_hidden_2 = 32 # 2nd hidden layer of neurons
n_hidden_3 = 16 # 3rd hidden layer of neurons
n_input = 100 # number of features after LSA
# Tensorflow Graph input
x = tf.placeholder(tf.float64, shape=[None, n_input], name="x-data")
y = tf.placeholder(tf.float64, shape=[None, 1], name="y-labels")
print("Creating model.")
# Create model
def multilayer_perceptron(x, weights):
    # First hidden layer with SIGMOID activation
    layer_1 = tf.matmul(x, weights['h1'])
    layer_1 = tf.nn.sigmoid(layer_1)
    # Second hidden layer with SIGMOID activation
    layer_2 = tf.matmul(layer_1, weights['h2'])
    layer_2 = tf.nn.sigmoid(layer_2)
    # Third hidden layer with SIGMOID activation
    layer_3 = tf.matmul(layer_2, weights['h3'])
    layer_3 = tf.nn.sigmoid(layer_3)
    # Output layer with SIGMOID activation
    out_layer = tf.matmul(layer_2, weights['out'])
    return out_layer
# Layer weights, should change them to see results
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], dtype=np.float64)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], dtype=np.float64)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3], dtype=np.float64)),
    'out': tf.Variable(tf.random_normal([n_hidden_2, 1], dtype=np.float64))
}
# Construct model
pred = multilayer_perceptron(x, weights)
# Define loss and optimizer
cost = tf.nn.l2_loss(pred-y,name="squared_error_cost")
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initializing the variables
init = tf.initialize_all_variables()
print("Model ready.")
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    print("Starting Training.")
    # Training cycle
    for epoch in range(training_epochs):
        #avg_cost = 0.
        # minibatch loading
        minibatch_x = training_set[mini_batch_size*epoch:mini_batch_size*(epoch+1)]
        minibatch_y = training_labels[mini_batch_size*epoch:mini_batch_size*(epoch+1)]
        # Run optimization op (backprop) and cost op
        _, c = sess.run([optimizer, cost], feed_dict={x: minibatch_x, y: minibatch_y})
        # Compute average loss
        avg_cost = c/(minibatch_x.shape[0])
        # Display logs per epoch
        if epoch % display_step == 0:
            print("Epoch:", '%05d' % (epoch), "Training error=", "{:.9f}".format(avg_cost))
    print("Optimization Finished!")
    # Test model
    # Calculate accuracy
    test_error = tf.nn.l2_loss(pred - y, name="squared_error_test_cost")/test_set.shape[0]
    print("Test Error:", test_error.eval({x: test_set, y: test_labels}))
Output
python nn.py
Importing dataset.
Dataset ready.
Creating model.
Model ready.
Starting Training.
Epoch: 00000 Training error= 0.331874878
Epoch: 00500 Training error= 0.121587482
Epoch: 01000 Training error= 0.112870921
Epoch: 01500 Training error= 0.110293652
Epoch: 02000 Training error= 0.122655269
Epoch: 02500 Training error= 0.124971940
Epoch: 03000 Training error= 0.125407845
Epoch: 03500 Training error= 0.131942481
Epoch: 04000 Training error= 0.121696954
Epoch: 04500 Training error= 0.116669835
Epoch: 05000 Training error= 0.129558477
Epoch: 05500 Training error= 0.122952110
Epoch: 06000 Training error= 0.124655344
Epoch: 06500 Training error= 0.119827300
Epoch: 07000 Training error= 0.125183779
Epoch: 07500 Training error= 0.156429254
Epoch: 08000 Training error= 0.085632880
Epoch: 08500 Training error= 0.133913128
Epoch: 09000 Training error= 0.114762624
Epoch: 09500 Training error= 0.115107805
Optimization Finished!
Test Error: 0.116647016708
This is MMN's suggestion:
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], stddev=0, dtype=np.float64)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], stddev=0.01, dtype=np.float64)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3], stddev=0.01, dtype=np.float64)),
    'out': tf.Variable(tf.random_normal([n_hidden_2, 1], dtype=np.float64))
}
And this is the output:
Epoch: 00000 Training error= 0.107566668
Epoch: 00500 Training error= 0.289380907
Epoch: 01000 Training error= 0.339091784
Epoch: 01500 Training error= 0.358559815
Epoch: 02000 Training error= 0.122639698
Epoch: 02500 Training error= 0.125160135
Epoch: 03000 Training error= 0.126219718
Epoch: 03500 Training error= 0.132500418
Epoch: 04000 Training error= 0.121795254
Epoch: 04500 Training error= 0.116499476
Epoch: 05000 Training error= 0.124532673
Epoch: 05500 Training error= 0.124484790
Epoch: 06000 Training error= 0.118491177
Epoch: 06500 Training error= 0.119977633
Epoch: 07000 Training error= 0.127532511
Epoch: 07500 Training error= 0.159053519
Epoch: 08000 Training error= 0.083876224
Epoch: 08500 Training error= 0.131488483
Epoch: 09000 Training error= 0.123161189
Epoch: 09500 Training error= 0.125011362
Optimization Finished!
Test Error: 0.129284643093
Connected the third hidden layer, thanks to MMN

There was an error in my code: I had two hidden layers instead of three. I corrected it by doing:
'out': tf.Variable(tf.random_normal([n_hidden_3, 1], dtype=np.float64))
and
out_layer = tf.matmul(layer_3, weights['out'])
I went back to the old value of stddev, though, since it seems to cause smaller fluctuations in the cost function.
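To make the fix explicit, this is what the model function looks like once the third layer is actually wired to the output (a reconstruction of the change described above; everything else is unchanged):

def multilayer_perceptron(x, weights):
    # First hidden layer with SIGMOID activation
    layer_1 = tf.nn.sigmoid(tf.matmul(x, weights['h1']))
    # Second hidden layer with SIGMOID activation
    layer_2 = tf.nn.sigmoid(tf.matmul(layer_1, weights['h2']))
    # Third hidden layer with SIGMOID activation
    layer_3 = tf.nn.sigmoid(tf.matmul(layer_2, weights['h3']))
    # The output layer now reads from layer_3 instead of layer_2
    out_layer = tf.matmul(layer_3, weights['out'])
    return out_layer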
The output is still troubling:
Epoch: 00000 Training error= 0.477673073
Epoch: 00500 Training error= 0.121848744
Epoch: 01000 Training error= 0.112854530
Epoch: 01500 Training error= 0.110597624
Epoch: 02000 Training error= 0.122603499
Epoch: 02500 Training error= 0.125051472
Epoch: 03000 Training error= 0.125400717
Epoch: 03500 Training error= 0.131999354
Epoch: 04000 Training error= 0.121850889
Epoch: 04500 Training error= 0.116551533
Epoch: 05000 Training error= 0.129749704
Epoch: 05500 Training error= 0.124600464
Epoch: 06000 Training error= 0.121600218
Epoch: 06500 Training error= 0.121249676
Epoch: 07000 Training error= 0.132656938
Epoch: 07500 Training error= 0.161801757
Epoch: 08000 Training error= 0.084197352
Epoch: 08500 Training error= 0.132197409
Epoch: 09000 Training error= 0.123249055
Epoch: 09500 Training error= 0.126602369
Optimization Finished!
Test Error: 0.129230736355
Two more changes, thanks to Steven

Steven suggested changing the sigmoid activation functions to ReLU, so I tried it. In the meantime I noticed that I had not set an activation function for the output node, so I did that as well (it should be easy to see what I changed; a sketch follows below).
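In case the changes are hard to picture, the model now looks roughly like this (a reconstruction, not a verbatim copy of my code; which activation ended up on the output node in this first ReLU run is my assumption, since the run after this one uses sigmoid everywhere):

def multilayer_perceptron(x, weights):
    # Hidden layers switched from SIGMOID to RELU activation
    layer_1 = tf.nn.relu(tf.matmul(x, weights['h1']))
    layer_2 = tf.nn.relu(tf.matmul(layer_1, weights['h2']))
    layer_3 = tf.nn.relu(tf.matmul(layer_2, weights['h3']))
    # The output node now gets an activation function too (ReLU assumed here)
    out_layer = tf.nn.relu(tf.matmul(layer_3, weights['out']))
    return out_layer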
Starting Training.
Epoch: 00000 Training error= 293.245977809
Epoch: 00500 Training error= 0.290000000
Epoch: 01000 Training error= 0.340000000
Epoch: 01500 Training error= 0.360000000
Epoch: 02000 Training error= 0.285000000
Epoch: 02500 Training error= 0.250000000
Epoch: 03000 Training error= 0.245000000
Epoch: 03500 Training error= 0.260000000
Epoch: 04000 Training error= 0.290000000
Epoch: 04500 Training error= 0.315000000
Epoch: 05000 Training error= 0.285000000
Epoch: 05500 Training error= 0.265000000
Epoch: 06000 Training error= 0.340000000
Epoch: 06500 Training error= 0.180000000
Epoch: 07000 Training error= 0.370000000
Epoch: 07500 Training error= 0.175000000
Epoch: 08000 Training error= 0.105000000
Epoch: 08500 Training error= 0.295000000
Epoch: 09000 Training error= 0.280000000
Epoch: 09500 Training error= 0.285000000
Optimization Finished!
Test Error: 0.220196439287
And this is how it does with the sigmoid activation function on every node, output node included:
Epoch: 00000 Training error= 0.110878121
Epoch: 00500 Training error= 0.119393080
Epoch: 01000 Training error= 0.109229532
Epoch: 01500 Training error= 0.100436962
Epoch: 02000 Training error= 0.113160662
Epoch: 02500 Training error= 0.114200962
Epoch: 03000 Training error= 0.109777990
Epoch: 03500 Training error= 0.108218725
Epoch: 04000 Training error= 0.103001394
Epoch: 04500 Training error= 0.084145737
Epoch: 05000 Training error= 0.119173495
Epoch: 05500 Training error= 0.095796251
Epoch: 06000 Training error= 0.093336573
Epoch: 06500 Training error= 0.085062860
Epoch: 07000 Training error= 0.104251661
Epoch: 07500 Training error= 0.105910949
Epoch: 08000 Training error= 0.090347288
Epoch: 08500 Training error= 0.124480612
Epoch: 09000 Training error= 0.109250224
Epoch: 09500 Training error= 0.100245836
Optimization Finished!
Test Error: 0.110234139674
I find these numbers very strange. In the first case, it gets stuck at a higher cost than with sigmoid, even though sigmoid should saturate very early. In the second case, it starts with a training error that is almost the final one... so it basically converges after a single mini-batch. I'm starting to think that I am not computing the cost correctly in this line: avg_cost = c / (minibatch_x.shape[0])
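For reference, tf.nn.l2_loss(pred - y) returns sum((pred - y)**2) / 2 summed over the whole mini-batch, so c / minibatch_x.shape[0] is half the mean squared error rather than the mean squared error itself. One way to sidestep the manual division would be to compute the per-example mean directly in the graph (a sketch, not the code I actually ran):

# Mean squared error, averaged over all examples in the mini-batch
cost = tf.reduce_mean(tf.square(pred - y), name="mean_squared_error_cost")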
Have you tried changing your line 'cost = tf.nn.l2_loss(pred-y, name="squared_error_cost")' to 'cost = tf.nn.square(tf.sub(pred, y))'? – Kashyap
Can you print the accuracy (the percentage of correctly classified samples) during training? –
@Kashyap: when I print the cost I get a "non-empty format string passed to object.__format__" error, and I can't seem to solve it. – Darkobra
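Putting the two comments together, here is a minimal sketch of the suggested cost plus an accuracy op (assuming tf.nn.square in the comment means tf.square, and that pred goes through a sigmoid so it lies in (0, 1); the per-element square has to be reduced to a scalar before it can be minimized or formatted, which is likely what triggers the format error mentioned above):

# Kashyap's suggested cost returns one value per element, so reduce it
# to a scalar before minimizing or printing it with "{:.9f}"
cost = tf.reduce_mean(tf.square(tf.sub(pred, y)))

# Accuracy: threshold the sigmoid output at 0.5 and compare with the labels
predicted_class = tf.round(pred)
correct = tf.equal(predicted_class, y)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float64))

# During training, e.g.:
# acc = sess.run(accuracy, feed_dict={x: minibatch_x, y: minibatch_y})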