2017-08-14 102 views

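TensorFlow input pipeline returning multiple values

I want to build an input pipeline in TensorFlow for image classification, so I want to batch images together with their corresponding labels. The TensorFlow documentation suggests using tf.train.batch to create input batches: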

train_batch, train_label_batch = tf.train.batch(
[train_image, train_image_label], 
batch_size=batch_size, 
num_threads=1, 
capacity=10*batch_size, 
enqueue_many=False, 
shapes=[[224,224,3], [len(labels),]], 
allow_smaller_final_batch=True 
) 

However, I think this could be a problem if I feed the batches into the graph like this:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=train_label_batch, logits=Model(train_batch))) 

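The question is whether the operations in the cost function dequeue each image together with its corresponding label, or whether the two are returned separately, which would mean training on mismatched images and labels.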

Answer

You need to consider a few things to preserve the ordering of images and labels.

Suppose we need a function that gives us images and labels.

import tensorflow as tf

# _get_list, IMG_DIM, IMG_SIZE, BATCH_SIZE and MAX_TEST_EXAMPLE are assumed
# to be defined elsewhere in the same module.

def _get_test_images(_train=False):
    """
    Gets the test images and labels as a batch

    Inputs:
    ======
    _train       : Boolean if images are from training set

    Outputs:
    ========
    images_batch : Batch of images containing BATCH_SIZE images at a time
    label_batch  : Batch of labels corresponding to the images in images_batch
    idx          : Batch of indexes of images
    """
    # get images and labels
    _, _img_names, _img_class, index = _get_list(_train=_train)

    # total number of distinct images used for train will be equal to the images
    # fed in tf.train.slice_input_producer as _img_names
    img_path, label, idx = tf.train.slice_input_producer(
        [_img_names, _img_class, index], shuffle=False)

    img_path = tf.cast(tf.convert_to_tensor(img_path), dtype=tf.string)
    label = tf.convert_to_tensor(label)
    idx = tf.convert_to_tensor(idx)

    # read file
    image_file = tf.read_file(img_path)

    # decode jpeg/png/bmp
    # tf.image.decode_image won't give shape out. So it will give error while resizing
    image = tf.image.decode_jpeg(image_file)

    # image preprocessing
    image = tf.image.resize_images(image, [IMG_DIM, IMG_DIM])
    float_image = tf.cast(image, dtype=tf.float32)

    # subtracting mean and divide by standard deviation
    float_image = tf.image.per_image_standardization(float_image)

    # set the shape
    float_image.set_shape(IMG_SIZE)
    labels_original = tf.cast(label, dtype=tf.int32)
    img_index = tf.cast(idx, dtype=tf.int32)

    # parameters for the batching queue
    batch_size = BATCH_SIZE
    min_fraction_of_examples_in_queue = 0.3
    num_preprocess_threads = 1
    num_examples_per_epoch = MAX_TEST_EXAMPLE
    min_queue_examples = int(num_examples_per_epoch *
                             min_fraction_of_examples_in_queue)

    # image, label and index are enqueued together as one element
    images_batch, label_batch, idx = tf.train.batch(
        [float_image, labels_original, img_index],
        batch_size=batch_size,
        num_threads=num_preprocess_threads,
        capacity=min_queue_examples + 3 * batch_size)

    # Display the training images in the visualizer.
    tf.summary.image('images', images_batch)

    return images_batch, label_batch, idx

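Here, tf.train.slice_input_producer([_img_names,_img_class,index],shuffle=False) is the interesting part: if you set shuffle=True, it shuffles all three arrays in coordination, so each image path stays aligned with its class and index.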
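To make "shuffling in coordination" concrete, here is a minimal plain-Python sketch of the idea, not TensorFlow's implementation: a single permutation is drawn and applied to all three parallel lists, so every (name, class, index) triple survives the shuffle. The function name coordinated_shuffle and the sample file names and labels are made up for illustration.

```python
import random


def coordinated_shuffle(names, classes, indexes, seed=None):
    """Shuffle three parallel lists with one shared permutation."""
    if not (len(names) == len(classes) == len(indexes)):
        raise ValueError("all lists must have the same length")
    order = list(range(len(names)))
    random.Random(seed).shuffle(order)  # one permutation for all three lists
    return ([names[i] for i in order],
            [classes[i] for i in order],
            [indexes[i] for i in order])


# Hypothetical data standing in for _img_names, _img_class and index.
img_names = ["0001.jpg", "0002.jpg", "0003.jpg", "0004.jpg"]
img_class = [1, 2, 3, 4]
index = [0, 1, 2, 3]

s_names, s_class, s_index = coordinated_shuffle(
    img_names, img_class, index, seed=0)

# The order may change, but the set of (name, class, index) triples does not.
pairs_before = set(zip(img_names, img_class, index))
pairs_after = set(zip(s_names, s_class, s_index))
print(pairs_before == pairs_after)
```

Shuffling each list independently would break the alignment, which is why the three lists are passed to the producer together rather than one at a time.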

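The second thing is num_preprocess_threads. As long as a single thread does the enqueue work, batches come out in a deterministic order. With more than one thread, the order in which examples reach the batch becomes nondeterministic: at the position where a single-threaded run would give 0001.jpg (true label 1), you might instead get the example with label 2 or 4. Each image is still enqueued together with its own label and index, so the pairing itself is preserved. Once a batch is dequeued it is an ordinary tensor, and tf.nn.softmax_cross_entropy_with_logits should have no problem with such tensors.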
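The threading point can be illustrated with a plain-Python analogy, using the standard library's queue.Queue and threading in place of TensorFlow's queues. This is a sketch under that assumption, not how tf.train.batch is implemented. The helper run_pipeline and the sample examples are hypothetical. Each worker moves a whole (name, label, index) tuple at a time, so the arrival order may vary with several workers while every tuple stays intact.

```python
import queue
import threading


def run_pipeline(examples, num_threads):
    """Move (name, label, index) tuples from a source queue to a batch queue."""
    source = queue.Queue()
    for example in examples:
        source.put(example)
    batch_queue = queue.Queue()

    def worker():
        while True:
            try:
                # One dequeue returns the whole tuple, never a single field.
                name, label, idx = source.get_nowait()
            except queue.Empty:
                return
            batch_queue.put((name, label, idx))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [batch_queue.get() for _ in range(len(examples))]


# Hypothetical examples: file 000N.jpg has label N and index N - 1.
examples = [("%04d.jpg" % i, i, i - 1) for i in range(1, 9)]

single = run_pipeline(examples, num_threads=1)  # same order as the input
multi = run_pipeline(examples, num_threads=4)   # order may differ between runs

print(single == examples)
print(sorted(multi) == sorted(examples))
```

The same reasoning applies to the cost expression in the question: train_batch and train_label_batch come out of the same batching operation, so evaluating the cost in one run uses one dequeued batch for both.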