Benchmarking quantization on Android

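I've been benchmarking tensorflow models on the Exynos 7420 using benchmark_model. I'd like to test the speedups from quantization described in Pete Warden's blog, but I haven't been able to compile benchmark_model with the quantization dependencies, because they break a lot of things.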

I've followed the guidelines listed in this stack overflow thread:

// tensorflow/tools/benchmark/BUILD cc_binary

deps = [":benchmark_model_lib", 
      "//tensorflow/contrib/quantization/kernels:quantized_ops", 
      ],

// tensorflow/contrib/quantization/kernels/BUILD:

deps = [ 
    "//tensorflow/contrib/quantization:cc_array_ops", 
    "//tensorflow/contrib/quantization:cc_math_ops", 
    "//tensorflow/contrib/quantization:cc_nn_ops", 
    #"//tensorflow/core", 
    #"//tensorflow/core:framework", 
    #"//tensorflow/core:lib", 
    #"//tensorflow/core/kernels:concat_lib_hdrs", 
    #"//tensorflow/core/kernels:conv_ops", 
    #"//tensorflow/core/kernels:eigen_helpers", 
    #"//tensorflow/core/kernels:ops_util", 
    #"//tensorflow/core/kernels:pooling_ops", 
    "//third_party/eigen3", 
    "@gemmlowp//:eight_bit_int_gemm", 
],

Then I ran:

bazel build -c opt --cxxopt='-std=gnu++11' --crosstool_top=//external:android/crosstool --cpu=armeabi-v7a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain tensorflow/tools/benchmark:benchmark_model --verbose_failures

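This (after following all the other instructions in the linked thread) succeeds, except that it fails to link against pthread.

I've tried removing -lpthread from tf_copts() in tensorflow/tensorflow.bzl, and likewise in tensorflow/tools/proto_text/BUILD and tensorflow/cc/BUILD.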

def tf_copts(): 
    return (["-fno-exceptions", "-DEIGEN_AVOID_STL_ARRAY"] + 
      if_cuda(["-DGOOGLE_CUDA=1"]) + 
      if_android_arm(["-mfpu=neon"]) + 
      select({"//tensorflow:android": [ 
        "-std=c++11", 
        "-DMIN_LOG_LEVEL=0", 
        "-DTF_LEAN_BINARY", 
        "-O2", 
        ], 
        "//tensorflow:darwin": [], 
        "//tensorflow:ios": ["-std=c++11",], 
        #"//conditions:default": ["-lpthread"]})) 
        "//conditions:default": []}))

I still get the link error:

external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lpthread 
collect2: error: ld returned 1 exit status
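
For context on the error: Android's C library (Bionic) implements the pthread functions inside libc and ships no separate libpthread, which is why the NDK linker cannot resolve -lpthread. If the flag survives after editing the three files above, it is coming from somewhere else in the build. A minimal sketch for locating the remaining occurrences follows. It is not part of the original thread: the helper name find_flag is hypothetical, it assumes Python 3, and it only scans files named BUILD or ending in .bzl.

```python
import os


def find_flag(root, flag="-lpthread", names=("BUILD",), suffixes=(".bzl",)):
    """Yield (path, line_number, line) for every build-file line containing flag.

    Hypothetical helper: only files named BUILD or ending in .bzl are scanned.
    """
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if fname in names or fname.endswith(suffixes):
                path = os.path.join(dirpath, fname)
                # errors="replace" keeps the scan going on odd encodings.
                with open(path, errors="replace") as fh:
                    for num, line in enumerate(fh, 1):
                        if flag in line:
                            yield path, num, line.rstrip()


if __name__ == "__main__":
    # Run from the root of the tensorflow checkout.
    for path, num, line in find_flag("tensorflow"):
        print("%s:%d: %s" % (path, num, line))
```

Each hit is a candidate place where the flag still reaches the Android link line; whether a given hit applies to the Android configuration depends on the surrounding select() or condition.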

Any help is much appreciated, I'm fairly stuck.

Env:

Ubuntu 14.04
tensorflow commit #4462
android_ndk_r11c
android-sdk-linux r24.4.1
Python 2.7.12 :: Continuum Analytics, Inc.
./configure with no support for GCP, HDFS, or GPU

Source

2016-09-21 Dwight Crow

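Transcribing the GitHub answer from Andrew Harp of the TF team. Thanks!!!

None of the changes above are necessary. You can get quantization working for benchmark_model (or any target that depends on android_tensorflow_lib) with the following:

git pull --recurse-submodules (to get the @gemmlowp library; alternatively, git clone --recursive)
Make the following edit to //tensorflow/core/BUILD: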

diff --git a/tensorflow/core/BUILD b/tensorflow/core/BUILD
--- a/tensorflow/core/BUILD
+++ b/tensorflow/core/BUILD
@@ -713,8 +713,11 @@ cc_library(
 # binary size (by packaging a reduced operator set) is a concern.
 cc_library(
     name = "android_tensorflow_lib",
-    srcs = if_android([":android_op_registrations_and_gradients"]),
-    copts = tf_copts(),
+    srcs = if_android([":android_op_registrations_and_gradients",
+        "//tensorflow/contrib/quantization:android_ops",
+        "//tensorflow/contrib/quantization/kernels:android_ops",
+        "@gemmlowp//:eight_bit_int_gemm_sources"]),
+    copts = tf_copts() + ["-Iexternal/gemmlowp"],
     linkopts = ["-lz"],
     tags = [
         "manual",

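Just tested, works great. Interestingly, quantization produces a graph a quarter of the size, but inference runs 4-5x slower than with the unquantized graph. It seems the quantized ops are still being optimized.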
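
The quarter-size figure is what one would expect from storing each weight in 8 bits instead of as a 32-bit float. A minimal sketch of that arithmetic, assuming a generic min/max linear 8-bit scheme (an illustration only, not TensorFlow's actual quantization code; the function names and sample weights are made up):

```python
import struct


def quantize(weights):
    """Map floats linearly onto 0..255 using the min and max of the list."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all weights are equal
    codes = bytes(min(255, max(0, round((w - lo) / scale))) for w in weights)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # made-up sample values
codes, lo, scale = quantize(weights)

# Size of the same weights stored as 32-bit floats versus 8-bit codes.
float_bytes = len(struct.pack("%df" % len(weights), *weights))
quant_bytes = len(codes)

restored = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("float32 bytes:", float_bytes, "8-bit bytes:", quant_bytes)
print("max round-trip error:", max_err)
```

A real graph also stores the min/max range per tensor plus the graph structure itself, so the overall file shrinks to roughly, not exactly, a quarter. The size saving says nothing about speed, which depends on how well the 8-bit kernels are optimized.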

Source

2016-09-23 04:18:53

Great, glad it's working now. Yes, we're still optimizing the quantized ops, so don't take the current speed as the maximum you can get.
