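I need to rewrite this code using numexpr. It computes the Euclidean norm (here, the squared distance) between each row of a matrix data [rows x cols] and a vector vec [1 x cols]: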
d = ((data-vec)**2).sum(axis=1)
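How can I do this? Is there perhaps another, faster way?
The complication is that I use HDF5 and the data matrix is read from it. For example, the einsum-based code further down gives the error: objects are not aligned.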
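# Naive NumPy solution; can it be parallelized?
# fileName, rows, batches, vec and k are assumed to be defined at module level.
import time
import numpy as np
import tables

def test_bruteforce_knn():
    h5f = tables.open_file(fileName)
    t0 = time.time()
    d = np.empty((rows*batches,))
    for i in range(batches):
        # squared distance from every row of the current batch to vec
        d[i*rows:(i+1)*rows] = ((h5f.root.carray[i*rows:(i+1)*rows]-vec)**2).sum(axis=1)
    print(time.time()-t0)
    ndx = d.argsort()
    print(ndx[:k])
    h5f.close()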
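On the "faster way" part: if only the k nearest rows are needed, the full `d.argsort()` can in principle be replaced by a partial selection. This is a minimal sketch, not something from the original post; `k_smallest` is a hypothetical helper name, and it uses `np.argpartition`, which places the k smallest entries first without sorting the rest.

```python
import numpy as np

def k_smallest(d, k):
    """Indices of the k smallest entries of d, ordered by increasing value."""
    d = np.asarray(d)
    if k >= d.size:
        return np.argsort(d)            # nothing to gain from partitioning
    idx = np.argpartition(d, k)[:k]     # the k smallest, in arbitrary order
    return idx[np.argsort(d[idx])]      # sort only those k entries

# Tiny in-memory example (made-up distances)
d = np.array([5.0, 1.0, 4.0, 0.5, 3.0, 2.0])
top3 = k_smallest(d, 3)
print(top3)
```

Whether this matters in practice depends on how large `rows*batches` is compared with the time spent reading from HDF5, so it is worth timing both.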
# Using the expansion trick (does not work; error: objects are not aligned)
def test_bruteforce_knn():
    h5f = tables.open_file(fileName)
    t0 = time.time()
    d = np.empty((rows*batches,))
    for i in range(batches):
        d[i*rows:(i+1)*rows] = (np.einsum('ij,ij->i', h5f.root.carray[i*rows:(i+1)*rows],
                                          h5f.root.carray[i*rows:(i+1)*rows])
                                + np.dot(vec, vec)
                                - 2 * np.dot(h5f.root.carray[i*rows:(i+1)*rows], vec))
    print(time.time()-t0)
    ndx = d.argsort()
    print(ndx[:k])
    h5f.close()
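A likely cause of "objects are not aligned", assuming `vec` really has shape (1, cols) as described at the top: `np.dot(vec, vec)` and `np.dot(chunk, vec)` both need a 1-D `vec` for the shapes to line up. The sketch below flattens `vec` and reads the batch only once; `squared_distances` is a hypothetical name, and the check uses a small in-memory array instead of the HDF5 file.

```python
import numpy as np

def squared_distances(chunk, vec):
    """Squared Euclidean distance from each row of chunk to vec."""
    chunk = np.asarray(chunk, dtype=np.float64)   # materialize the slice once
    v = np.ravel(vec).astype(np.float64)          # (1, cols) -> (cols,) so np.dot aligns
    # ||x - v||^2 = ||x||^2 + ||v||^2 - 2 x.v, evaluated for every row x
    return (np.einsum('ij,ij->i', chunk, chunk)
            + np.dot(v, v)
            - 2.0 * np.dot(chunk, v))

# Compare against the direct formula on random data
rng = np.random.default_rng(0)
data = rng.random((6, 4))
vec = rng.random((1, 4))
expected = ((data - vec) ** 2).sum(axis=1)
result = squared_distances(data, vec)
print(np.allclose(result, expected))
```

In the loop this would be called as `squared_distances(h5f.root.carray[i*rows:(i+1)*rows], vec)`. Note that the expanded form can return tiny negative values because of rounding, which the direct `((data-vec)**2).sum(axis=1)` never does.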
Using numexpr: it seems numexpr doesn't understand h5f.root.carray[i*rows:(i+1)*rows]. Does it have to be assigned to a variable first?
import numexpr as ne

def test_bruteforce_knn():
    h5f = tables.open_file(fileName)
    t0 = time.time()
    d = np.empty((rows*batches,))
    for i in range(batches):
        # As reported, this call fails, and its result is never stored in d
        ne.evaluate("sum((h5f.root.carray[i*rows:(i+1)*rows] - vec) ** 2, axis=1)")
    print(time.time()-t0)
    ndx = d.argsort()
    print(ndx[:k])
    h5f.close()
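One way this might be made to work, as a sketch rather than a verified fix for the original setup: slice the array in plain Python, bind the result to a simple name, and reference only simple names inside the expression string. `squared_distances_ne` and its `source` parameter are hypothetical; `source` stands in for `h5f.root.carray` and is a plain NumPy array in the check below.

```python
import numpy as np
import numexpr as ne

def squared_distances_ne(source, vec, rows, batches):
    """Batched squared Euclidean distances from the rows of source to vec."""
    v = np.ravel(vec)                     # 1-D copy of vec, broadcast against each batch
    d = np.empty(rows * batches)
    for i in range(batches):
        # Slice outside the expression string; numexpr only sees the name "chunk"
        chunk = source[i * rows:(i + 1) * rows]
        d[i * rows:(i + 1) * rows] = ne.evaluate(
            "sum((chunk - v) ** 2, axis=1)",
            local_dict={"chunk": chunk, "v": v},
        )
    return d

# In-memory stand-in for the HDF5 array
data = np.arange(24.0).reshape(6, 4)
vec = np.ones((1, 4))
d = squared_distances_ne(data, vec, rows=2, batches=3)
print(np.allclose(d, ((data - vec) ** 2).sum(axis=1)))
```

Passing `local_dict` explicitly is optional, since `ne.evaluate` otherwise looks the names up in the calling frame, but it makes clear which arrays the expression uses.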
Thanks, squared_euclidean_distances definitely works faster. Please also look at my update; I can't get the numexpr version to work. – mrgloom