2017-02-22

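Reading a batch of images is slow?

I have the following code, which reads a batch of 10 images from files on the local disk.

The problem is that the code seems to run very slowly: it takes about 5-6 minutes to complete. The directory containing the images holds approximately 25,000 images.

Is the code correct, or am I doing something stupid?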

import matplotlib.pyplot as plt 
import numpy as np 
from PIL import Image 
import tensorflow as tf 

image_width = 202 
image_height = 180 
num_channels = 3 

filenames = tf.train.match_filenames_once("./train/Resized/*.jpg") 

def read_image(filename_queue): 
    image_reader = tf.WholeFileReader() 
    key, image_filename = image_reader.read(filename_queue) 
    image = tf.image.decode_jpeg(image_filename) 
    image.set_shape((image_height, image_width, 3)) 

    return image 

def input_pipeline(filenames, batch_size, num_epochs=None): 
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=num_epochs, shuffle=True) 
    input_image = read_image(filename_queue) 
    min_after_dequeue = 10000 
    capacity = min_after_dequeue + 3 * batch_size 
    image_batch = tf.train.shuffle_batch(
        [input_image], batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return image_batch 

new_batch = input_pipeline(filenames, 10) 

with tf.Session() as sess: 
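    # Required to get the filename matching to run. In TF 1.x,
    # match_filenames_once stores its result in a local variable,
    # so the local variables are initialized as well.
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()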

    # Coordinate the loading of image files. 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 

    b1 = sess.run(new_batch) 

    # Finish off the filename queue coordinator. 
    coord.request_stop() 
    coord.join(threads) 

To narrow down the problem, you could time each function call you suspect of being the culprit, e.g. time `image_reader.read(..)` and `tf.image.decode_jpeg(..)`. – kaufmanu
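One way to follow this suggestion, sketched with the standard library only: wrap each suspect call in a small timer. The helper `timed` and the stand-in workload `fake_decode` are hypothetical names used for illustration. Note that with the graph-mode pipeline above, `image_reader.read(..)` and `tf.image.decode_jpeg(..)` only build graph nodes, so the calls worth timing would be the individual `sess.run(...)` invocations.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_decode(n):
    # Hypothetical stand-in for an expensive step such as decoding an image.
    return sum(i * i for i in range(n))


result, seconds = timed(fake_decode, 100000)
print("fake_decode took %.4f s" % seconds)
```

In the question's code, the same helper could be applied as `timed(sess.run, new_batch)` to measure how long the first batch takes.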

Answers


Reduce min_after_dequeue to 1000 and try again. The timings below show the effect of different min_after_dequeue values.

min_after_dequeue = 2000 => 2.1 sec to finish

min_after_dequeue = 100 => 0.13 sec to finish

Use the following to generate the timeline:

from tensorflow.python.client import timeline 

run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE) 
run_metadata = tf.RunMetadata() 

b1 = sess.run(new_batch, options=run_options, run_metadata=run_metadata)
# Create the Timeline object, and write it to a json 
tl = timeline.Timeline(run_metadata.step_stats) 
ctf = tl.generate_chrome_trace_format() 
with open('timelinestack1.json', 'w') as f: 
    f.write(ctf) 

Also, make sure all your images have the same size, as you mentioned. Otherwise, add the following line before set_shape().

image = tf.image.resize_images(image, [224, 224])

I hope this is a reasonable answer.


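Thanks, reducing min_after_dequeue to 1,000 significantly reduces the execution time. So, if I understand this correctly, to return a batch of 10 images, min_after_dequeue images are actually read first, and then 10 images are drawn at random from those. Is that correct? – OlavT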


Yes, it is best to limit the value of min_after_dequeue based on the batch size. – hars
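To make this exchange concrete, here is a toy model of a shuffling queue in plain Python. It is not TensorFlow's implementation, only a sketch under the assumption that a dequeue of `batch_size` elements is allowed only once at least `min_after_dequeue` elements would remain in the buffer afterwards. The function name `items_read_before_first_batch` is hypothetical.

```python
import random


def items_read_before_first_batch(source, batch_size, min_after_dequeue, seed=0):
    """Toy model: count the reads needed before the first shuffled batch."""
    rng = random.Random(seed)
    buffer = []
    reads = 0
    it = iter(source)
    # Fill the buffer until a dequeue would still leave min_after_dequeue items.
    while len(buffer) < min_after_dequeue + batch_size:
        buffer.append(next(it))
        reads += 1
    # Draw batch_size items uniformly at random from the buffer.
    batch = []
    for _ in range(batch_size):
        idx = rng.randrange(len(buffer))
        buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
        batch.append(buffer.pop())
    return reads, batch


# Assumed setup mirroring the question: about 25,000 files, batches of 10.
for m in (10000, 1000, 100):
    reads, batch = items_read_before_first_batch(range(25000), 10, m)
    print("min_after_dequeue=%d -> %d items read" % (m, reads))
```

In this model, the question's settings (batch size 10, min_after_dequeue = 10000) require 10,010 images to be read and decoded before the first batch is returned, which is in line with the speedup reported after lowering the value.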