How to optimize TensorFlow model serving

I trained a Keras model and now want to deploy it with TensorFlow Serving. To do that, I converted it to the SavedModel format like this:

    # Imports are assumed; the original snippet does not show them.
    import os

    import tensorflow as tf
    from keras import backend as K

    # Switch Keras to inference mode before exporting the graph.
    K.set_learning_phase(0)
    K._LEARNING_PHASE = tf.constant(0)
    # sess = K.get_session()
    if not os.path.exists(path):
        os.mkdir(path)
    # get_new_version() is a helper defined elsewhere (not shown); it picks
    # the numeric version subdirectory under the export path.
    export_path = os.path.join(
        tf.compat.as_bytes(path),
        tf.compat.as_bytes(str(get_new_version(path=path, current_version=int(version)))))
    print('Learning phase', K.learning_phase())
    print('Exporting trained model to', export_path)
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)

    # Describe the model's input and output tensors for the serving signature.
    model_input = tf.saved_model.utils.build_tensor_info(model.input)
    model_output = tf.saved_model.utils.build_tensor_info(model.output)

    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'inputs': model_input},
            outputs={'output': model_output},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

    # Export the graph and variables from the current Keras session.
    with K.get_session() as sess:
        builder.add_meta_graph_and_variables(
            sess=sess, tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                'predict':
                    prediction_signature,
            })

        builder.save()

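I then started using TensorFlow Serving (with the TensorFlow model server installed via apt-get). But my model is 376 MB on disk (saved_model.pb plus the variables folder), prediction is slow (about 0.3 seconds per request), and latency gets worse as the request rate (rps) increases.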

So I would like to optimize my model. Does anyone know some tricks for doing this?

P.S. My Keras model was saved with save_model(model).


2017-09-27 streamride

A few thoughts:

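1. Make sure you have not left any queues (e.g. FIFOQueue) in your serving graph. Queues are often used during training to hide I/O latency, but they can hurt serving performance.
2. Consider batching multiple inference requests together into a single call to the TF model/graph. See --enable_batching, and tune it via --batching_parameters_file.

Beyond these tips, you would have to look at the structure of the model itself. Perhaps others have insights on that.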
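As a rough illustration of the batching tip, the sketch below writes a file that could be passed to --batching_parameters_file. The four field names come from TensorFlow Serving's batching parameters; the numeric values, the helper names, and the file name are illustrative assumptions, not tuned recommendations.

```python
# Sketch only: the values below are placeholders to be tuned against real traffic.
BATCHING_PARAMS = {
    "max_batch_size": 32,          # upper bound on requests merged into one batch
    "batch_timeout_micros": 5000,  # how long to wait for a batch to fill up
    "max_enqueued_batches": 100,   # bound on queued batches before requests are rejected
    "num_batch_threads": 4,        # threads processing batches in parallel
}


def render_batching_parameters(params):
    """Render the parameters as text-proto lines, one wrapped value per line."""
    return "".join(
        "{} {{ value: {} }}\n".format(name, value) for name, value in params.items()
    )


def write_batching_parameters(path, params=BATCHING_PARAMS):
    """Write the rendered parameters to `path` and return the path."""
    with open(path, "w") as f:
        f.write(render_batching_parameters(params))
    return path
```

After calling write_batching_parameters("batching.config"), the server could be started with something like `tensorflow_model_server --port=9000 --model_name=my_model --model_base_path=/path/to/export --enable_batching --batching_parameters_file=batching.config`, where the port, model name, and paths are hypothetical.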

- Chris (TF-Serving team)


2017-09-27 17:29:46

Thanks for the answer, Chris. Could you tell me more about FIFOQueue in the model? – streamride
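On the queue question, one rough way to check whether an exported graph still contains queue ops is to scan the op types of its nodes. The sketch below is a heuristic under stated assumptions: the function name and the sample node list are made up for illustration, and the substring match on "Queue" may flag unrelated ops.

```python
def find_queue_ops(nodes):
    """Return (name, op_type) pairs whose op type looks like a queue op.

    `nodes` is any iterable of (node_name, op_type) pairs. The substring
    test is a heuristic: it matches op types such as FIFOQueueV2 or
    QueueDequeueV2, and could also match other ops with "Queue" in the type.
    """
    return [(name, op) for name, op in nodes if "Queue" in op]


# With TensorFlow 1.x available, the pairs could be read from a SavedModel, e.g.:
#   with tf.Session(graph=tf.Graph()) as sess:
#       tf.saved_model.loader.load(
#           sess, [tf.saved_model.tag_constants.SERVING], export_dir)
#       nodes = [(n.name, n.op) for n in sess.graph_def.node]

# Hypothetical node list standing in for a real graph.
example_nodes = [
    ("input_1", "Placeholder"),
    ("input_queue", "FIFOQueueV2"),
    ("input_queue_dequeue", "QueueDequeueV2"),
    ("dense_1/MatMul", "MatMul"),
]
print(find_queue_ops(example_nodes))
```

An empty result suggests no queue ops under this heuristic; any hits name the nodes worth inspecting before serving.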

Am I right that the graph gets frozen when I save the model? – streamride
