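Theano: GPU usage on a fresh Jetson TX1
(TL;DR for the very long error messages in the question below, here is the specific question: why can't the test code execute on the TX1's GPU, and what do I need to do to make that happen?)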
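I have just flashed and set up a brand-new Nvidia Jetson TX1 with JetPack 2.3. I am trying to install Theano on the TX1 so that I can use the onboard GPU for further machine learning and neural network applications.
However, I cannot seem to get the GPU itself to work.
The Theano installation steps were taken from here: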
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libblas-dev git
pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user # Need Theano 0.8(not yet released) or more recent
The installed Theano version is 0.9.0.dev2, and Python is version 2.7.12.
The test script I used, taken from here:
from theano import function, config, shared, tensor
import numpy
import time
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
When I run it as suggested:
THEANO_FLAGS=device=cuda0 python gpu_tutorial1.py
I get the following response, full of errors and warnings, and the code executes on the CPU rather than the GPU:
ERROR (theano.gpuarray): pygpu was configured but could not be imported
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 21, in <module>
import pygpu
ImportError: No module named pygpu
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem:
['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." 
search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"]
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 12.736936 seconds
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753
1.62323285]
Used the cpu
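The first error in this output says that `pygpu` could not be imported, and Theano's `gpuarray` module tries to import it when `device=cuda0` is requested. A minimal sketch for confirming that outside of Theano, using nothing but a plain import attempt; the helper name `can_import` is made up for illustration and is not part of Theano:

```python
def can_import(module_name):
    """Return True if `module_name` can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # pygpu is the module named in the traceback above; numpy is listed
    # only as a comparison that is expected to be installed.
    for name in ("numpy", "pygpu"):
        print("%s importable: %s" % (name, can_import(name)))
```

Running this with the same `python` that runs `gpu_tutorial1.py` shows whether the missing module is an installation problem rather than something specific to the test script.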
When I change the device flag to 'gpu':
THEANO_FLAGS=device=gpu python gpu_tutorial1.py
things improve somewhat: the NVIDIA Tegra X1 is at least detected, although in the end it is still not used:
Using gpu device 0: NVIDIA Tegra X1 (CNMeM is disabled, cuDNN 5105)
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem:
['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." 
search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"]
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 12.820628 seconds
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753
1.62323285]
Used the cpu
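One detail in both runs may be relevant: the printed graph has type `TensorType(float64, vector)`, so `config.floatX` is evidently `float64`. The Theano GPU tutorial states that the old `device=gpu` back end only accelerates `float32` computations, which would make adding `floatX=float32` worth trying. The sketch below only assembles the comma-separated `THEANO_FLAGS` string; `build_theano_flags` is a hypothetical helper, not part of Theano, and whether this setting resolves the problem on the TX1 is an untested assumption.

```python
def build_theano_flags(**options):
    """Join key=value pairs into the comma-separated THEANO_FLAGS format."""
    # sorted() keeps the output deterministic regardless of dict ordering
    pairs = ["%s=%s" % (key, value) for key, value in sorted(options.items())]
    return ",".join(pairs)


if __name__ == "__main__":
    flags = build_theano_flags(device="gpu", floatX="float32")
    # Prints a command line to run by hand; nothing is executed here
    print("THEANO_FLAGS=%s python gpu_tutorial1.py" % flags)
```

If the setting takes effect, the graph printed by the test script should no longer show a `float64` vector.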
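I do intend to send the warning lines to the Theano mailing list, but that warning seems unrelated to my main problem right now: why does this test code not execute on the TX1's GPU, and what do I need to do to make it?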
For future users: 'device=gpu' is a deprecated option; it is now 'cuda*'. https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 – tandem