Library for math operations on float arrays on the GPU?

I want to perform basic math operations (addition, subtraction, division, multiplication) on large float arrays on the GPU. Is there a C++ library that can do this?

For example, in pseudocode:

A = [1,2,3,...]
B = [2,3,9,...]
C = A+B //[3,5,12,...]
D = A-B //[-1,-1,-6,...]
E = A/B //[0.5,0.67,0.33,...]
F = A*B //[2,6,27,...]
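To pin down the element-wise semantics of the pseudocode, here is a minimal CPU-only sketch in standard C++. The helper name `elementwise` is hypothetical and does not come from any GPU library; it is only meant as a baseline for checking what a GPU library returns.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of any GPU library): applies a binary
// operation to each pair of elements on the CPU and returns the result.
template <typename Op>
std::vector<float> elementwise(const std::vector<float>& a,
                               const std::vector<float>& b, Op op) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("input arrays must have equal length");
    }
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = op(a[i], b[i]);
    }
    return out;
}
```

With this, `C = A+B` becomes `elementwise(A, B, std::plus<float>())`, and the other three operations use `std::minus<float>()`, `std::divides<float>()` and `std::multiplies<float>()`.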

2013-10-26 user1973386

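Take a look at the Boost.Compute library. It is an STL-like C++ library that lets you run many operations on a GPU (or any OpenCL-compatible device). Unlike Thrust, it is not limited to NVIDIA GPUs.

The source code is here: https://github.com/boostorg/compute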

2013-10-26 18:04:33

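I was not aware of this library (thanks!). Have you played with it? Any recommendations? – Escualo

@Arrieta: Yes, I wrote it ;-). It is still under active development, but it provides a much more natural C++ interface to OpenCL. If you send me information about your use case (by email or otherwise), I can give more specific advice. –

Hi @KyleLutz, could you be so kind as to show in your answer how to implement C = A + B, or some other custom binary operation, with boost::compute? That would be very helpful. Thanks in advance! – Sergei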

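OpenCL is one such "library". Technically it is not a library but a framework with its own kernel language, based on C99. The OpenCL runtime lets you launch many threads on the GPU (or CPU), each responsible for a small part of the computation, and you can configure how many threads to run.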
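The "one thread per small part" model can be illustrated without any OpenCL installation. The sketch below is a serial CPU emulation in plain C++, not OpenCL code: `add_kernel` and `launch_add` are hypothetical names, and an ordinary loop stands in for the runtime that would normally run the invocations in parallel.

```cpp
#include <cstddef>
#include <vector>

// Body written for ONE work-item: it handles only the element at index gid.
// In OpenCL C the index would come from get_global_id(0).
void add_kernel(std::size_t gid, const float* a, const float* b, float* c) {
    c[gid] = a[gid] + b[gid];
}

// Stand-in for the runtime (hypothetical helper): invokes the kernel once per
// index in [0, global_size). A real OpenCL runtime would spread these
// invocations across the device's threads instead of looping serially.
std::vector<float> launch_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    const std::size_t global_size = a.size() < b.size() ? a.size() : b.size();
    std::vector<float> c(global_size);
    for (std::size_t gid = 0; gid < global_size; ++gid) {
        add_kernel(gid, a.data(), b.data(), c.data());
    }
    return c;
}
```

The point of the split is that the kernel never loops over the array itself; the number of invocations (the global size) is chosen by the caller at launch time.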

2013-10-26 17:30:55

Thrust.

This example is from their website:

#include <thrust/host_vector.h> 
#include <thrust/device_vector.h> 
#include <thrust/generate.h> 
#include <thrust/sort.h> 
#include <thrust/copy.h> 
#include <cstdlib> 

int main(void) 
{ 
    // generate 32M random numbers on the host 
    thrust::host_vector<int> h_vec(32 << 20); 
    thrust::generate(h_vec.begin(), h_vec.end(), rand); 

    // transfer data to the device 
    thrust::device_vector<int> d_vec = h_vec; 

    // sort data on the device (846M keys per second on GeForce GTX 480) 
    thrust::sort(d_vec.begin(), d_vec.end()); 

    // transfer data back to host 
    thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin()); 

    return 0; 
}

Their saxpy example is closer to what you are asking for; see this snippet:

thrust::transform(X.begin(), X.end(), Y.begin(), Y.begin(), saxpy_functor(A));
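The snippet leaves `saxpy_functor` undefined. Below is a host-only sketch of what such a functor might look like, using `std::transform`, whose binary overload takes its arguments in the same order as the call above. This is an illustration rather than the Thrust example itself: the wrapper `saxpy` is a hypothetical name, and for Thrust the call operator would also need to be marked `__host__ __device__` and the containers would be `thrust::device_vector`.

```cpp
#include <algorithm>
#include <vector>

// Functor computing a * x + y for one pair of elements.
// For Thrust, operator() would additionally be marked __host__ __device__.
struct saxpy_functor {
    const float a;
    explicit saxpy_functor(float a_) : a(a_) {}
    float operator()(const float& x, const float& y) const {
        return a * x + y;
    }
};

// Host-only stand-in for the Thrust call: Y <- A * X + Y, in place.
// Assumes Y has at least as many elements as X.
void saxpy(float A, const std::vector<float>& X, std::vector<float>& Y) {
    std::transform(X.begin(), X.end(), Y.begin(), Y.begin(), saxpy_functor(A));
}
```

The four operations from the question follow the same pattern with a functor that returns `x + y`, `x - y`, `x / y` or `x * y`, writing into a separate output range instead of back into `Y`.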

2013-10-26 17:31:33 Escualo

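Looks like a great library; the only problem I can see is that it only works on NVIDIA cards. Do you know of any other Thrust-like libraries that are not limited to NVIDIA cards? – user1973386

@user1973386: I am not familiar with any. However, user @KyleLutz has answered that 'Boost.Compute' may be an alternative. I will take a look. – Escualo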

VexCL is another library that can help you. As of v1.0.0 it has OpenCL and CUDA backends. Here is a minimal example:

#include <vector>
#include <vexcl/vexcl.hpp>

int main() {
    // Get all compute devices that support double precision.
    vex::Context ctx(vex::Filter::DoublePrecision);

    std::vector<double> a = {1, 2, 3, 4, 5};
    std::vector<double> b = {6, 7, 8, 9, 10};

    // Allocate memory and copy input data to compute devices.
    vex::vector<double> A(ctx, a);
    vex::vector<double> B(ctx, b);

    // Do the computations.
    vex::vector<double> C = A + B;
    vex::vector<double> D = A - B;
    vex::vector<double> E = A / B;
    vex::vector<double> F = A * B;

    // Copy the sum C back to the host, overwriting a.
    vex::copy(C, a);
}

2013-11-21 10:55:25 ddemidov
