CUDA: in a kernel with a shared array and several local ints, can data from the shared cache be used to select a local variable?
__global__ void myKern()
{
    int globalID = ....; // initialize global thread ID
    __shared__ int TMS[3]; // populate shared array in a simple way
    if (globalID == 0)
    {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();
    int val0 = 69;
    int val1 = 36;
    int val2 = 92;
    int random_number = ....; // use cuRAND to get a random number between 0 and 3
    int output = TMS[random_number];
    // At this point, I want the variable "output" to be used to access one of my local ints.
    // For example, if "output" = 2, I want to be able to print val2 to screen.
    // In a fantasy computer language this might look something like:
    // std::cout << "val" + "output";
    // I just want 92 to be printed to the screen.
    ???
}
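This may look like a strange algorithm, but if I can do this, it will let me combine the speed of registers with the large size of the shared cache in my CUDA project. Please no brute-force binary (if/else) solutions, because I will be using a shared array of size 2698 and 33 local variables.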
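Could you please clarify what you actually need? You say: _if "output" = 2, I want to be able to print val2 to screen_. Then you say: _std::cout << "val" + "output"_. It seems you want to treat a bunch of register variables as if they were a single array and exploit array pointer arithmetic? – JackOLantern
Sorry if that was unclear, it is hard to explain. Maybe this will clarify: – Jordan
If output = 0, I want 69 to be printed. – Jordan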
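One way to get the behaviour described above (output = 0 prints 69, output = 2 prints 92) is to hold the values in a thread-private array and index it with `output`. The following is only a minimal sketch under stated assumptions, not a definitive answer: `select_local` is a hypothetical helper, the random index is passed in as a kernel argument instead of coming from cuRAND, and the shared array is filled by thread 0 of each block. Note that registers are not addressable, so an array indexed with a value known only at run time is generally placed by the compiler in local memory and not in registers; this sketch therefore gives the indexing semantics, but probably not the register speed the question hopes for.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): returns the "local" value
// selected by idx. Valid indices are 0, 1 and 2.
__host__ __device__ inline int select_local(int idx)
{
    // Thread-private array standing in for val0, val1, val2.
    // Since idx is only known at run time, the compiler will generally keep
    // this array in local memory, because registers cannot be indexed.
    const int vals[3] = {69, 36, 92};
    return vals[idx];
}

// Assumption: random_number (0..2) is supplied by the caller instead of cuRAND.
__global__ void myKern(int random_number)
{
    int globalID = blockIdx.x * blockDim.x + threadIdx.x;

    __shared__ int TMS[3];
    // Shared memory is per block, so one thread of every block fills it.
    if (threadIdx.x == 0)
    {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();

    int output = TMS[random_number];

    // Print from a single thread to keep the output readable.
    if (globalID == 0)
    {
        printf("%d\n", select_local(output));
    }
}
```

Launched from host code as `myKern<<<1, 32>>>(2);` followed by `cudaDeviceSynchronize();`, thread 0 reads `TMS[2]`, which is 2, and prints the value stored at index 2 of the array. If the index were a compile-time constant (for example inside a fully unrolled loop), the compiler would have a chance to keep the values in registers instead.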