我正在使用tensorflow對象檢測API來執行一些半實時對象檢測任務。 圖像將以2張/秒的速度由相機拍攝。蝕刻圖像將被裁剪成4張小圖像,因此總共需要處理8張圖像/秒。加載圖像進行檢測的更有效的方法
我的檢測模型已導出到凍結圖形(.pb文件)並加載到GPU內存中。然後我將圖像加載到numpy數組,將它們饋入我的模型中。
檢測本身只需要約0.1秒/圖像,但是,加載每個圖像大約需要0.45秒。
我使用的腳本是從對象檢測api(link)提供的代碼示例中修改的,它讀取每個圖像並將它們轉換爲numpy數組,然後輸入檢測模型。這個過程中最耗時的一部分是load_image_into_numpy_array
,大約需要0.45秒。
的腳本如下:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import timeit
import scipy.misc
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util
# Path to frozen detection graph. This is the actual model that is used for the
# object detection.
PATH_TO_CKPT = 'animal_detection.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'animal_label_map.pbtxt')
NUM_CLASSES = 1
detection_graph = tf.Graph()
with detection_graph.as_default():
od_graph_def = tf.GraphDef()
with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def,name='')
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map,
max_num_classes=NUM_CLASSES,
use_display_name=True)
category_index = label_map_util.create_category_index(categories)
def load_image_into_numpy_array(image):
(im_width, im_height) = image.size
return np.array(image.getdata()).reshape(
(im_height, im_width, 3)).astype(np.uint8)
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the
# images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test'
TEST_IMAGE_PATHS = [
os.path.join(PATH_TO_TEST_IMAGES_DIR,'image{}.png'.format(i)) for i in range(1, 10) ]
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
with detection_graph.as_default():
with tf.Session(graph=detection_graph, config=config) as sess:
for image_path in TEST_IMAGE_PATHS:
start = timeit.default_timer()
image = Image.open(image_path)
# the array based representation of the image will be used later in order to prepare the
# result image with boxes and labels on it.
image_np = load_image_into_numpy_array(image)
# Expand dimensions since the model expects images to have shape: [1, None, None, 3]
image_np_expanded = np.expand_dims(image_np, axis=0)
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
end = timeit.default_timer()
print(end-start)
start = timeit.default_timer()
# Each box represents a part of the image where a particular object was detected.
boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
# Each score represent how level of confidence for each of the objects.
# Score is shown on the result image, together with the class label.
scores = detection_graph.get_tensor_by_name('detection_scores:0')
classes = detection_graph.get_tensor_by_name('detection_classes:0')
num_detections = detection_graph.get_tensor_by_name('num_detections:0')
# Actual detection.
(boxes, scores, classes, num_detections) = sess.run(
[boxes, scores, classes, num_detections],
feed_dict={image_tensor: image_np_expanded})
stop = timeit.default_timer()
print (stop - start)
# Visualization of the results of a detection.
vis_util.visualize_boxes_and_labels_on_image_array(
image_np,
np.squeeze(boxes),
np.squeeze(classes).astype(np.int32),
np.squeeze(scores),
category_index,
use_normalized_coordinates=True,
line_thickness=2)
我想更有效的方式來加載由相機產生的圖像,首先想到的是要避免numpy的陣列,並嘗試使用tensorflow本土的方式來加載圖像,但我不知道從哪裏開始,因爲我對tensorflow很陌生。
如果我能找到一些tensorflow的方式來加載圖像,也許我可以將4個圖像分爲1批次,並將它們送入我的模型,以便我可以在速度上得到一些改進。
一個不成熟的想法是嘗試將從1張原始圖像裁剪的4張小圖像保存到tf_record文件中,並將tf_record文件作爲一個批處理文件加載到模型中,但我不知道如何實現這一點。
任何幫助將不勝感激。