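Theano: difference between CPU and GPU matrix dot results compared with NumPy

I recently set up Theano on Windows 10 with CUDA v7.5, cuDNN v3 and Visual Studio 2013 Community Edition. Here is my test script: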
import numpy as np
import time
import theano
A = np.random.rand(10000,10000).astype(theano.config.floatX)
B = np.random.rand(10000,10000).astype(theano.config.floatX)
np_start = time.time()
AB = A.dot(B)
np_end = time.time()
X,Y = theano.tensor.matrices('XY')
mf = theano.function([X,Y],X.dot(Y))
t_start = time.time()
tAB = mf(A,B)
t_end = time.time()
print "NP time: %f[s], theano time: %f[s] (times should be close when run on CPU!)" %(
np_end-np_start, t_end-t_start)
print "Result difference: %f" % (np.abs(AB-tAB).max(),)
I got the following results:
G:\ml\Theano\Projects>python Test.py
NP time: 10.585000[s], theano time: 10.587000[s] (times should be close when run on CPU!)
Result difference: 0.000000
G:\ml\Theano\Projects>python Test.py
Using gpu device 0: GeForce GTX 970 (CNMeM is disabled)
NP time: 10.838000[s], theano time: 1.294000[s] (times should be close when run on CPU!)
Result difference: 0.022461
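To verify that the installation works, I ran the code above, taken from the Theano Windows install page, on both the CPU and the GPU. As you can see, there is a significant difference of 0.022 when the computation runs on the GPU. I just want to know whether this is expected or whether I am doing something wrong.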
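One way to put the 0.022 figure in context is to look at the relative error rather than the absolute one. Each entry of the product is a sum of 10000 terms with an expected value of about 0.25, so the entries are around 2500, and 0.022 is roughly 1e-5 of that. The sketch below uses only NumPy (no Theano, no GPU) and compares a float32 product with a float64 reference. The function name `dot_error_report` and the smaller default size are assumptions made for illustration; it shows how to measure the rounding error and does not prove what causes the GPU difference.

```python
import numpy as np

def dot_error_report(n=512, seed=0):
    """Compare a float32 matrix product against a float64 reference.

    Hypothetical helper for illustration; n is kept small so it runs quickly.
    """
    rng = np.random.RandomState(seed)
    A = rng.rand(n, n).astype(np.float32)
    B = rng.rand(n, n).astype(np.float32)

    # Reference computed in double precision from the same float32 inputs.
    ref = A.astype(np.float64).dot(B.astype(np.float64))
    # Single-precision product, as with floatX = float32.
    single = A.dot(B)

    diff = np.abs(single - ref)
    abs_err = diff.max()
    rel_err = (diff / np.abs(ref)).max()
    return abs_err, rel_err

abs_err, rel_err = dot_error_report()
print("max abs error: %g, max rel error: %g" % (abs_err, rel_err))
```

Applying the same relative measure to AB and tAB from the script, for example with np.allclose(AB, tAB, rtol=1e-4), would show whether the two results agree to within a tolerance suited to float32.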
Here is my .theanorc:
[global]
device = gpu
floatX = float32
[nvcc]
fastmath = True
Did you use float32 when you ran it on the CPU?
Yes, I kept the floatX = float32 line the same and only swapped device = cpu/gpu.