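I am currently learning TensorFlow and am trying to build a classification model with a softmax model. In my program, the training dataset is stored in a CSV file with the two data columns on the left and the two label columns on the right. For example: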
data1,data2,label1,label2
234,23,1,0  # 234 is larger than 23, so label1 is marked as 1 and label2 is marked as 0
156,113,1,0
1,4,0,1
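With the training dataset above, the model works: it classifies test data by which of the two numbers is larger, and the cost converges to nearly zero.
However, when I changed the dataset so that the labels mark even numbers, with the goal of classifying test data by whether a number is even, the model fails and the cost fluctuates. The dataset is as follows: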
data1,data2,label1,label2
24,35,1,0  # 24 is even, so label1 is marked as 1 and label2 is marked as 0
156,553,1,0
1,4,0,1
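To make the two labelling rules concrete, here is a minimal sketch of how such CSV files could be generated. The function names, value range, row count, and seed are all hypothetical and not part of my original program. I am also assuming that the "even" label depends only on data1 (which matches the three sample rows) and that a tie in the "larger" rule goes to label2.

```python
import csv
import random


def label_larger(a, b):
    """Return (label1, label2); label1 is 1 when a is larger than b.

    Assumption: a tie (a == b) is assigned to label2.
    """
    return (1, 0) if a > b else (0, 1)


def label_even(a, b):
    """Return (label1, label2); label1 is 1 when a (data1) is even.

    Assumption: only data1 decides the label; b is ignored.
    """
    return (1, 0) if a % 2 == 0 else (0, 1)


def make_rows(label_fn, n_rows=1000, low=1, high=1000, seed=0):
    """Build n_rows tuples of (data1, data2, label1, label2)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        a = rng.randint(low, high)
        b = rng.randint(low, high)
        rows.append((a, b) + label_fn(a, b))
    return rows


def write_csv(rows, fileobj):
    """Write a header line followed by the rows to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["data1", "data2", "label1", "label2"])
    writer.writerows(rows)
```

For example, calling write_csv(make_rows(label_even), f) on a file opened with open("classification.csv", "w", newline="") would produce a 1000-row file in the layout shown above.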
Is there something wrong with my program? Why can it learn which number is larger, but fail on even numbers? Thanks, everyone! Here is my code:
import tensorflow as tf
import os
import numpy as np
def next_batch(num, data, labels):
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
dir_path = os.path.dirname(os.path.realpath(__file__))
filename = dir_path + "/classification.csv"
x = tf.placeholder(tf.float32, [None, 2])
y = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, 2]))
b = tf.Variable(tf.zeros([2]))
# pred holds the raw logits; the softmax is applied inside the loss function
pred = tf.add(tf.matmul(x, W), b)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(0.1).minimize(cost)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    with open(filename) as inf:
        # Skip header
        next(inf)
        rows = []
        for line in inf:
            data1, data2, label1, label2 = line.strip().split(",")
            rows.append((float(data1), float(data2), int(label1), int(label2)))
    # One row per sample: data1, data2, label1, label2
    result_array = np.array(rows, dtype=np.float32)
    k = result_array[:, 2:4]   # labels
    gg = result_array[:, 0:2]  # input data
    for i in range(0, 3000):
        batch_xs, batch_ys = next_batch(200, gg, k)
        h, cos = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
        print(cos)
    # Testing data; this prints the raw logits, not probabilities
    print(sess.run(pred, feed_dict={x: [[5, 2], [4, 9], [4, 3], [5, 2], [3, 6], [30, 21], [32, 20], [3, 4]]}))
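Since the last line prints raw logits rather than class probabilities, here is a minimal NumPy sketch of how that output could be turned into probabilities and a predicted class. The softmax helper and the example_logits values are hypothetical stand-ins for illustration; they are not actual output from my model.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical logits standing in for the array returned by sess.run(pred, ...).
example_logits = [[2.0, -1.0], [0.5, 0.5]]

probs = softmax(example_logits)
# Column 0 corresponds to label1, column 1 to label2.
predicted_class = probs.argmax(axis=1)
```

Each row of probs sums to 1, so it is easier to read than the logits when checking which label the model prefers for a test pair.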