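Shared memory pointer arithmetic

I don't understand what exactly is happening in the following two lines:
unsigned char *membershipChanged = (unsigned char *)sharedMemory;
and
float *clusters = (float *)(sharedMemory + blockDim.x);
I assume that in the first line sharedMemory is effectively just renamed to membershipChanged, but why is blockDim.x added to the sharedMemory pointer in the second line? Where does that address point to?
sharedMemory is declared with extern __shared__ char sharedMemory[];
I found the code in a CUDA kmeans implementation: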
__global__ void find_nearest_cluster(int numCoords,
                                     int numObjs,
                                     int numClusters,
                                     float *objects,        // [numCoords][numObjs]
                                     float *deviceClusters, // [numCoords][numClusters]
                                     int *membership,       // [numObjs]
                                     int *intermediates)
{
    extern __shared__ char sharedMemory[];

    // The type chosen for membershipChanged must be large enough to support
    // reductions! There are blockDim.x elements, one for each thread in the
    // block.
    unsigned char *membershipChanged = (unsigned char *)sharedMemory;
    float *clusters = (float *)(sharedMemory + blockDim.x);

    membershipChanged[threadIdx.x] = 0;

    // BEWARE: We can overrun our shared memory here if there are too many
    // clusters or too many coordinates!
    for (int i = threadIdx.x; i < numClusters; i += blockDim.x) {
        for (int j = 0; j < numCoords; j++) {
            clusters[numClusters * j + i] = deviceClusters[numClusters * j + i];
        }
    }
    .....
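And is 'extern __shared__ char sharedMemory[];' enough to tell CUDA to allocate all of the available shared memory? – Framester
No. Shared memory is allocated dynamically at kernel launch time with the following syntax: kernel<<<num_blocks, num_threads, num_bytes_smem>>>(args...). The pointer to the dynamically allocated memory is available through the extern __shared__ variable.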
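To make the comment above concrete, here is a minimal sketch of how one dynamically sized shared-memory buffer can be split into a byte array followed by a float array, the same way the snippet in the question does it. The names layout_demo, run_layout_demo, flags, values and smemBytes are made up for this illustration and are not part of the kmeans code. The sketch assumes the block size is a multiple of sizeof(float), so that the float pointer stays properly aligned, and that the requested size fits within the device's per-block shared memory limit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel (not from the kmeans code): carves a single
// dynamic shared-memory buffer into a byte array followed by a float array.
__global__ void layout_demo(int numFloats, float *out)
{
    // Size is unknown at compile time; it comes from the third launch parameter.
    extern __shared__ char sharedMemory[];

    // Bytes [0, blockDim.x): one flag byte per thread.
    unsigned char *flags = (unsigned char *)sharedMemory;
    // Bytes [blockDim.x, ...): the float array starts right after the flags.
    // Assumption: blockDim.x is a multiple of sizeof(float), so this is aligned.
    float *values = (float *)(sharedMemory + blockDim.x);

    flags[threadIdx.x] = 1;
    for (int i = threadIdx.x; i < numFloats; i += blockDim.x)
        values[i] = 2.0f * i;
    __syncthreads();

    if (threadIdx.x == 0) {
        int flagSum = 0;
        for (int i = 0; i < blockDim.x; i++)
            flagSum += flags[i];
        out[0] = (float)flagSum;                          // number of flags set
        out[1] = values[numFloats - 1];                   // last float written
        out[2] = (float)((char *)values - sharedMemory);  // byte offset of values
    }
}

// Hypothetical host helper: launches one block and copies three results back.
// Returns false if any CUDA call fails.
bool run_layout_demo(int threads, int numFloats, float result[3])
{
    float *dOut = nullptr;
    if (cudaMalloc(&dOut, 3 * sizeof(float)) != cudaSuccess)
        return false;

    // Room for `threads` flag bytes plus `numFloats` floats.
    size_t smemBytes = threads * sizeof(unsigned char) + numFloats * sizeof(float);
    layout_demo<<<1, threads, smemBytes>>>(numFloats, dOut);

    bool ok = (cudaGetLastError() == cudaSuccess);
    // cudaMemcpy waits for the kernel to finish before copying.
    ok = ok && (cudaMemcpy(result, dOut, 3 * sizeof(float),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(dOut);
    return ok;
}
```

Calling run_layout_demo(128, 10, result) on a machine with a working CUDA device should report a byte offset of 128 in result[2], i.e. the float array begins exactly blockDim.x bytes into the buffer. In other words, sharedMemory + blockDim.x is simply the address just past the per-thread byte array, and the extern declaration reserves nothing by itself: the buffer is only as large as the third launch parameter says.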