2017-06-20 68 views

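PyTorch code producing a CUDA runtime error

A friend of mine implemented a working sparse version of torch.bmm, but when I try to run a test I get a runtime error (unrelated to that implementation) that I don't understand. I have seen several threads about this error but could not find a solution. Here are the code and the error: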

import time

import torch

# BatchCSR and SparseBMM come from the friend's implementation (not shown here).

if __name__ == "__main__":
    tmp = torch.zeros(1).cuda()
    batch_csr = BatchCSR()
    sparse_bmm = SparseBMM()

    i = torch.LongTensor([[0, 5, 8], [1, 5, 8], [2, 5, 8]])
    v = torch.FloatTensor([4, 3, 8])
    s = torch.Size([3, 500, 500])

    indices, values, size = i, v, s

    a_ = torch.sparse.FloatTensor(indices, values, size).cuda().transpose(2, 1)
    batch_size, num_nodes, num_faces = a_.size()

    a = a_.to_dense()

    for _ in range(10):
        b = torch.randn(batch_size, num_faces, 16).cuda()

        # Time the dense cuBLAS bmm.
        torch.cuda.synchronize()
        time1 = time.time()
        result = torch.bmm(a, b)
        torch.cuda.synchronize()
        time2 = time.time()
        print("{} CuBlas dense bmm".format(time2 - time1))

        # Time the custom sparse bmm.
        torch.cuda.synchronize()
        time1 = time.time()
        col_ind, col_ptr = batch_csr(a_.indices(), a_.size())
        my_result = sparse_bmm(a_.values(), col_ind, col_ptr, a_.size(), b)
        torch.cuda.synchronize()
        time2 = time.time()
        print("{} My sparse bmm".format(time2 - time1))

        print("{} Diff".format((result - my_result).abs().max()))

And the error:

Traceback (most recent call last): 
    File "sparse_bmm.py", line 72, in <module> 
    b = torch.randn(3, 500, 16).cuda() 
    File "/home/bizeul/virtual_env/lib/python2.7/site-packages/torch/_utils.py", line 65, in _cuda 
    return new_type(self.size()).copy_(self, async) 
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorCopy.c:18 

When I run it with CUDA_LAUNCH_BLOCKING=1, I get this error:

/b/wheel/pytorch-src/torch/lib/THC/THCTensorIndex.cu:121: void indexAddSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2]: block: [0,0,0], thread: [0,0,0] Assertion `dstIndex < dstAddDimSize` failed. 
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THCS/generic/THCSTensorMath.cu line=292 error=59 : device-side assert triggered 
Traceback (most recent call last): 
    File "sparse_bmm.py", line 69, in <module> 
    a = a_.to_dense() 
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THCS/generic/THCSTensorMath.cu:292 
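Well, CUDA is asynchronous by nature, so an assertion triggered on the device will not come with a useful stack trace. Try running your script like this in your terminal: 'CUDA_LAUNCH_BLOCKING=1 python your_script.py' and update your question. – entrophy

Thanks, I edited my post. – Gericault

So what exactly *is* your question? – talonmies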

Answers

1

The indices you are passing to create the sparse tensor are incorrect.

Here is what they should be:

i = torch.LongTensor([[0, 1, 2], [5, 5, 5], [8, 8, 8]])
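
One way to read this fix: each inner list in the question's `i` looks like the (batch, row, column) coordinate of a single nonzero entry, whereas the constructor expects one row of indices per dimension. Transposing the first layout gives the second. Here is a minimal sketch in plain Python, assuming that reading of the question; `coords_to_index_rows` is a hypothetical helper name, not part of torch:

```python
def coords_to_index_rows(coords):
    """Turn per-entry coordinates into one list of indices per dimension."""
    # zip(*coords) transposes the nested list: entry-major -> dimension-major.
    return [list(dim) for dim in zip(*coords)]


# The question's indices, read as one (batch, row, col) coordinate per entry.
question_i = [[0, 5, 8], [1, 5, 8], [2, 5, 8]]

index_rows = coords_to_index_rows(question_i)
print(index_rows)  # [[0, 1, 2], [5, 5, 5], [8, 8, 8]]
```

The printed nested list matches the corrected `i` above, so it can be passed to `torch.LongTensor` as is.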

How to create a sparse tensor:

Let's take a simple example. Say we want the following tensor:

 0  0  0  2  0
 0  0  0  0  0
 0  0  0  0 20
[torch.cuda.FloatTensor of size 3x5 (GPU 0)]

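As you can see, the number 2 needs to be at position (0, 3) of the sparse tensor, and the number 20 needs to be at position (2, 4).

To create this, the index tensor needs one row per dimension and one column per nonzero value: the first row holds the row indices and the second row holds the column indices. It should look like this: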

[[0 , 2], 
[3 , 4]] 

And now the code to create the above sparse tensor:

i = torch.LongTensor([[0, 2], [3, 4]])
v = torch.FloatTensor([2, 20])
s = torch.Size([3, 5])
a_ = torch.sparse.FloatTensor(i, v, s).cuda()
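
To make the index layout concrete, here is a rough sketch of what a dense conversion does with those three pieces, written in plain Python so it runs without a GPU. `coo_to_dense` is a hypothetical helper for illustration only; it is not how torch implements `to_dense()` and it handles only the 2-D case:

```python
def coo_to_dense(index_rows, values, size):
    """Scatter values into a 2-D list of zeros using per-dimension indices."""
    n_rows, n_cols = size
    dense = [[0] * n_cols for _ in range(n_rows)]
    row_idx, col_idx = index_rows
    for r, c, v in zip(row_idx, col_idx, values):
        # An index outside the size raises IndexError here; on the GPU the
        # equivalent mistake surfaces as a device-side assert instead.
        dense[r][c] += v
    return dense


dense = coo_to_dense([[0, 2], [3, 4]], [2, 20], (3, 5))
for row in dense:
    print(row)
# [0, 0, 0, 2, 0]
# [0, 0, 0, 0, 0]
# [0, 0, 0, 0, 20]
```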

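More comments on the assertion error raised by CUDA:

The message Assertion `dstIndex < dstAddDimSize` failed tells us that you very likely have an index out of bounds. So whenever you see it, look for places where you may have supplied wrong indices to a tensor.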
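
A check like this can be done on the CPU before anything is sent to the GPU. The following is a minimal sketch in plain Python, assuming the indices are available as nested lists (for a tensor, `i.tolist()` gives that form); `find_out_of_bounds` is a hypothetical helper name, not a torch function:

```python
def find_out_of_bounds(index_rows, size):
    """Return (dim, entry, index, dim_size) for each index outside [0, dim_size)."""
    problems = []
    for dim, (row, dim_size) in enumerate(zip(index_rows, size)):
        for entry, idx in enumerate(row):
            if not 0 <= idx < dim_size:
                problems.append((dim, entry, idx, dim_size))
    return problems


size = (3, 500, 500)

# Indices from the question: dimension 0 has size 3 but receives 5 and 8.
bad = find_out_of_bounds([[0, 5, 8], [1, 5, 8], [2, 5, 8]], size)
print(bad)  # [(0, 1, 5, 3), (0, 2, 8, 3)]

# Corrected indices from this answer: nothing is reported.
good = find_out_of_bounds([[0, 1, 2], [5, 5, 5], [8, 8, 8]], size)
print(good)  # []
```

Running such a check before building the sparse tensor turns the opaque device-side assert into a readable list of offending entries.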

Edit: my bad, I got it! – Gericault