2016-11-07

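Tensorflow - synchronized reading from tfrecord

I am fairly new to TensorFlow and I am trying to build an input pipeline from a tfrecord file. Each entry in the file contains three fields: two strings holding the paths to two image files, and one float tensor (the label for the example). I can write the information and read it back, but unfortunately I have trouble keeping the images and labels in sync.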

To write the records I use this code:

import tensorflow as tf  # TensorFlow 1.x API

writer = tf.python_io.TFRecordWriter(output_tfrecord)
...
for index in shuffled_indexes:
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                # Label tensor flattened to a plain list of floats.
                'label': tf.train.Feature(float_list=tf.train.FloatList(value=target.ravel().tolist())),
                # Image paths stored as byte strings.
                'image_1': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_1.encode()])),
                'image_2': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_2.encode()]))
            }
        )
    )
    writer.write(example.SerializeToString())
writer.close()

and to read the records back I use this one (in this example I ignore the 'image_2' field of each record):

def read_and_decode(filename, target_shape):
    # First construct a queue containing a list of filenames.
    # This lets a user split up their dataset into multiple files
    # to keep the size down.
    filename_queue = tf.train.string_input_producer(filename, num_epochs=None)

    # Symbolic reader that reads one example at a time.
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since all keys are required.
        features={
            'label': tf.FixedLenFeature(target_shape, tf.float32),
            'image_1': tf.FixedLenFeature([], tf.string),
            'image_2': tf.FixedLenFeature([], tf.string)
        }
    )

    # Second queue, fed with the image path parsed from the record.
    img_filename_queue = tf.train.string_input_producer([features['image_1']], shuffle=False)
    image_reader = tf.WholeFileReader()
    _, image_file = image_reader.read(img_filename_queue)
    image = tf.image.decode_jpeg(image_file, channels=3)
    with tf.control_dependencies([image]):
        label = features['label']

    return image, label

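Each image/label pair is one example in my training set. If I try to run them in a single session, the results I get are not synchronized. For example, in a toy case with only two records in the tfrecord file, the images and labels are swapped: the first label comes with the second image and vice versa.

An example of my session code: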

image, label = read_and_decode([outputfileName], result_shape)

with tf.Session() as sess:
    # Start the queue runners (input threads).
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for i in range(2):
        img, trg = sess.run([image, label])
        ioUtils.visualizeLabel(img, trg)

    # When done, ask the threads to stop.
    coord.request_stop()
    # Wait for threads to finish.
    coord.join(threads)

Any suggestions about what I am doing wrong?

Answers

OK, I figured it out. The problem was

img_filename_queue = tf.train.string_input_producer([features['image_1']],shuffle=False) 

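The string_input_producer was messing up the rest of the pipeline. It adds a second queue that is filled by its own queue runner, so the image path it hands out can come from a different record than the label fetched in the same sess.run call. The correct way to write read_and_decode is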

def read_and_decode_tfrecord(filename, target_shape):
    # First construct a queue containing a list of filenames.
    # This lets a user split up their dataset into multiple files
    # to keep the size down.
    filename_queue = tf.train.string_input_producer(filename, num_epochs=None)

    # Symbolic reader that reads one example at a time.
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since all keys are required.
        features={
            'label': tf.FixedLenFeature(target_shape, tf.float32),
            'image_1': tf.FixedLenFeature([], tf.string),
            'image_2': tf.FixedLenFeature([], tf.string)
        }
    )

    # Read the image directly from the path stored in this record,
    # so the image and the label come from the same parsed example.
    image_file = tf.read_file(features['image_1'])
    image = tf.image.decode_jpeg(image_file, channels=3)
    with tf.control_dependencies([image]):
        label = features['label']

    return image, label
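
To illustrate why the extra queue breaks the pairing, here is a toy model in plain Python. It is not TensorFlow code: `RECORDS`, `buggy_pipeline` and `fixed_pipeline` are hypothetical names, and the fixed interleaving of "queue runner" and "training step" is an assumption, since the real order depends on thread scheduling. The point it sketches is that in the first version the path and the label are taken from two separate reads of the record stream.

```python
from collections import deque
from itertools import cycle

# Hypothetical (image path, label) records, standing in for the tfrecord file.
RECORDS = [("img_a.jpg", 0.0), ("img_b.jpg", 1.0)]


def buggy_pipeline(records, steps):
    """Model of the version that uses a second filename queue."""
    reader = cycle(records)   # stands in for the TFRecord reader (num_epochs=None)
    path_queue = deque()      # stands in for the second string_input_producer queue
    out = []
    for _ in range(steps):
        # "Queue runner": reads a record only to enqueue its image path.
        path, _unused_label = next(reader)
        path_queue.append(path)
        # "Training step": reads another record to get the label ...
        _unused_path, label = next(reader)
        # ... but takes the image path from the second queue.
        out.append((path_queue.popleft(), label))
    return out


def fixed_pipeline(records, steps):
    """Model of the corrected version: one read yields both path and label."""
    reader = cycle(records)
    return [next(reader) for _ in range(steps)]


if __name__ == "__main__":
    print("buggy:", buggy_pipeline(RECORDS, 2))
    print("fixed:", fixed_pipeline(RECORDS, 2))
```

In this model the buggy version pairs a path with the label of a different record, while the fixed version always returns pairs that exist in `RECORDS`.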

If you solved the problem yourself, remember to mark your answer as accepted. – nessuno