Usage of << in C or CUDA

Tag: exponentiation

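What is the meaning of << in the line

// create arrays of 1M elements
const int num_elements = 1<<20;

in the code below? Is it specific to CUDA, or can it be used in standard C as well?

When I printf'ed num_elements, I got num_elements == 1048576.

That turns out to be 2^20. So is the << operator shorthand for exponentiation in C?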

// This example demonstrates parallel floating point vector 
// addition with a simple __global__ function. 

#include <stdlib.h> 
#include <stdio.h> 


// this kernel computes the vector sum c = a + b 
// each thread performs one pair-wise addition 
__global__ void vector_add(const float *a, 
          const float *b, 
          float *c, 
          const size_t n) 
{ 
    // compute the global element index this thread should process 
    unsigned int i = threadIdx.x + blockDim.x * blockIdx.x; 

    // avoid accessing out of bounds elements 
    if(i < n) 
    { 
        // sum elements
        c[i] = a[i] + b[i];
    } 
} 


int main(void) 
{ 
    // create arrays of 1M elements 
    const int num_elements = 1<<20; 

    // compute the size of the arrays in bytes 
    const int num_bytes = num_elements * sizeof(float); 

    // points to host & device arrays 
    float *device_array_a = 0; 
    float *device_array_b = 0; 
    float *device_array_c = 0; 
    float *host_array_a = 0; 
    float *host_array_b = 0; 
    float *host_array_c = 0; 

    // malloc the host arrays 
    host_array_a = (float*)malloc(num_bytes); 
    host_array_b = (float*)malloc(num_bytes); 
    host_array_c = (float*)malloc(num_bytes); 

    // cudaMalloc the device arrays 
    cudaMalloc((void**)&device_array_a, num_bytes); 
    cudaMalloc((void**)&device_array_b, num_bytes); 
    cudaMalloc((void**)&device_array_c, num_bytes); 

    // if any memory allocation failed, report an error message 
    if(host_array_a == 0 || host_array_b == 0 || host_array_c == 0 ||
       device_array_a == 0 || device_array_b == 0 || device_array_c == 0)
    {
        printf("couldn't allocate memory\n");
        return 1;
    }

    // initialize host_array_a & host_array_b 
    for(int i = 0; i < num_elements; ++i) 
    { 
        // make array a a linear ramp
        host_array_a[i] = (float)i;

        // make array b random
        host_array_b[i] = (float)rand()/RAND_MAX;
    } 

    // copy arrays a & b to the device memory space 
    cudaMemcpy(device_array_a, host_array_a, num_bytes, cudaMemcpyHostToDevice); 
    cudaMemcpy(device_array_b, host_array_b, num_bytes, cudaMemcpyHostToDevice); 

    // compute c = a + b on the device 
    const size_t block_size = 256; 
    size_t grid_size = num_elements/block_size; 

    // deal with a possible partial final block 
    if(num_elements % block_size) ++grid_size; 

    // launch the kernel 
    vector_add<<<grid_size, block_size>>>(device_array_a, device_array_b, device_array_c, num_elements); 

    // copy the result back to the host memory space 
    cudaMemcpy(host_array_c, device_array_c, num_bytes, cudaMemcpyDeviceToHost); 

    // print out the first 10 results 
    for(int i = 0; i < 10; ++i) 
    { 
        printf("result %d: %1.1f + %7.1f = %7.1f\n", i, host_array_a[i], host_array_b[i], host_array_c[i]);
    } 


    // deallocate memory 
    free(host_array_a); 
    free(host_array_b); 
    free(host_array_c); 

    cudaFree(device_array_a); 
    cudaFree(device_array_b); 
    cudaFree(device_array_c); 
}

asked 2011-11-04 by smilingbuddha

<< is a left shift... see http://en.wikipedia.org/wiki/Logical_shift – Aziz

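No, the << operator is the bit shift operator. It takes the bits of a number, such as 00101, and shifts them n places to the left, which multiplies the number by a power of two. So x << y is x * 2^y. This follows from the way numbers are stored inside the computer, which is binary.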
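As a minimal sketch of the identity x << y == x * 2^y: the helper name times_pow2 below is made up for illustration, and it uses an unsigned 32-bit type on the assumption that you want the shift to stay well defined for any shift count below 32.

```c
#include <stdint.h>

/* Hypothetical helper: multiply x by 2^y using a left shift.
   Unsigned arithmetic wraps modulo 2^32, so the result equals
   x * 2^y only while that product fits in 32 bits.
   y must be smaller than 32. */
uint32_t times_pow2(uint32_t x, unsigned y)
{
    return x << y;
}
```

With this helper, times_pow2(1, 20) gives 1048576, the same value the question observed for 1<<20.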

For example, the number 1, when stored as a 32-bit two's complement integer (which it is):

00000000000000000000000000000001

When you do

1 << 20

you are taking all the 1s in that binary representation and moving them over 20 places:

00000000000100000000000000000000

This is 2^20. This also works for sign-magnitude representation, one's complement, etc.

Another example: if you take the representation of 5:

00000000000000000000000000000101

and do 5 << 1, you get

00000000000000000000000000001010

which is 10, or 5 * 2^1.
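The bit patterns above can be reproduced with a short C sketch. The function name to_binary is an assumption for illustration, not something from the question's code; it writes the 32 binary digits of a value into a caller-supplied buffer.

```c
#include <stdint.h>

/* Hypothetical helper: write the 32 binary digits of v, most
   significant bit first, into out. The buffer must have room
   for 32 digits plus the terminating '\0'. */
void to_binary(uint32_t v, char out[33])
{
    for (int bit = 31; bit >= 0; --bit)
        out[31 - bit] = ((v >> bit) & 1u) ? '1' : '0';
    out[32] = '\0';
}
```

Calling to_binary(1u << 20, buf) and to_binary(5u << 1, buf) and printing buf with printf("%s\n", buf) shows the two shifted patterns listed above.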

Conversely, >> divides by a power of 2 by moving the bits n places to the right.
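A sketch of the right-shift case, using a made-up helper div_pow2. It sticks to unsigned values on purpose: in C, right-shifting a negative signed integer is implementation-defined, so the "divide by 2^n" reading is only guaranteed for non-negative values.

```c
#include <stdint.h>

/* Hypothetical helper: divide x by 2^n with a right shift.
   For unsigned values this is integer division that discards
   the remainder. n must be smaller than 32. */
uint32_t div_pow2(uint32_t x, unsigned n)
{
    return x >> n;
}
```

For instance, div_pow2(1048576, 20) yields 1 and div_pow2(7, 1) yields 3, since the low bit is shifted out.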

answered 2011-11-04 16:32:27

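However, C does not require two's complement. –

Actually, two's complement only applies to signed integers, and C doesn't require it. Also, for _binary_ numbers, shifting works so that each position you shift the value to the left multiplies it by 2 (the base). (Likewise, if you shift a decimal number to the left, each shift multiplies it by 10.) – Arkku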
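The point about bases can be sketched in C with a made-up helper, shift_left_in_base, which appends zero digits to a number as written in an arbitrary base. It is only an illustration of the arithmetic, not how << is implemented.

```c
#include <stdint.h>

/* Hypothetical helper: shift `value` left by `places` digit
   positions in the given base, i.e. multiply by base^places.
   Overflow of the 64-bit result is not checked. */
uint64_t shift_left_in_base(uint64_t value, unsigned base, unsigned places)
{
    while (places--)
        value *= base;
    return value;
}
```

In base 10, shift_left_in_base(5, 10, 2) is 500; in base 2, shift_left_in_base(1, 2, 20) matches 1 << 20.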

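Numbers are not stored in two's complement; two's complement is an interpretation of a series of bits. You can interpret a float as an int and use the shift operator on it, and it will work. The result just won't be what you expect. – Femaref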
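A sketch of that experiment in C. The helper shift_float_bits is hypothetical, and it assumes float is a 32-bit IEEE 754 single-precision value, which is common but not required by the C standard. It copies the bits with memcpy rather than casting pointers, which keeps the reinterpretation well defined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: treat the bits of a float as a 32-bit
   unsigned integer, shift them left by one place, and read the
   result back as a float. Assumes sizeof(float) == 4. */
float shift_float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the raw bit pattern */
    bits <<= 1;                      /* shift the pattern, not the value */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under the IEEE 754 assumption, shift_float_bits(1.0f) is not 2.0f: the shift moves the exponent field, and the result is 2^127 (roughly 1.7e38).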

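It's a shift. In binary, taking a 1 and shifting it 20 places to the left is equivalent to multiplying by 2^20.

Edit: Yes, it is standard C, and a very nice way to make clear to the reader that the value is a single 1 in the 20th bit position, more so than writing int a = 1048576;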
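To illustrate the readability argument, here is a small sketch with invented constant names. Sizes and single-bit masks written as shifts show the bit position directly, where the decimal literal would not.

```c
/* Sizes written as shifts: the exponent is visible at a glance.
   The names are hypothetical, chosen for this example. */
enum {
    ONE_KIB = 1 << 10,   /* 1024 */
    ONE_MIB = 1 << 20    /* 1048576 */
};

/* Hypothetical option flags: each one is a single distinct bit. */
enum {
    FLAG_READ  = 1 << 0,
    FLAG_WRITE = 1 << 1,
    FLAG_EXEC  = 1 << 2
};

/* Returns nonzero if every bit of `flag` is set in `options`. */
int has_flag(unsigned options, unsigned flag)
{
    return (options & flag) == flag;
}
```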

answered 2011-11-04 16:34:18

...and it is standard C. –

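The (standard) C left shift operator << shifts the bits (binary digits) of the value on its left side to the left by the number of "places" given by the value on its right side (filling in zeros on the right), i.e. 1 << 20 results in the binary number with a 1 followed by 20 zeros. Since binary is base 2, each shift to the left doubles the value (multiplies it by the base), i.e. it is equivalent to multiplying by a power of 2.

This property of binary numbers can be exploited to multiply positive integers by powers of 2 faster than with more general mathematical functions. (Likewise, in elementary school math, the analogous property of decimal numbers can be exploited when dealing with powers of 10...)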
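A rough sketch of the comparison, with two made-up functions that both compute 2^k: one with a general loop of multiplications, one with a single shift. No timing is claimed here, and optimizing compilers often turn multiplications by powers of 2 into shifts on their own.

```c
#include <stdint.h>

/* 2^k by repeated multiplication: k multiply steps.
   k must be smaller than 32 for the result to fit. */
uint32_t pow2_by_loop(unsigned k)
{
    uint32_t result = 1;
    while (k--)
        result *= 2;
    return result;
}

/* 2^k with a single shift. The cast keeps the shift unsigned,
   so k == 31 is well defined. k must be smaller than 32. */
uint32_t pow2_by_shift(unsigned k)
{
    return (uint32_t)1 << k;
}
```

Both functions return 1048576 for k = 20, matching num_elements in the question.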

answered 2011-11-04 16:44:23 by Arkku

Yes, thanks. =) – Arkku
