2014-02-27

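CUDA Dynamic Parallelism Makefile

This is my first program using Dynamic Parallelism, and I cannot get the code to compile. I need to run it for a research project at my university, and any help would be greatly appreciated.

I get the following error: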

/cm/shared/apps/cuda50/toolkit/5.0.35/bin/nvcc -m64 -dc -gencode arch=compute_35,code=sm_35 -rdc=true -dlink -po maxrregcount=16 -I/cm/shared/apps/cuda50/toolkit/5.0.35 -I. -I.. -I../../common/inc -o BlackScholes.o -c BlackScholes.cu 
g++ -m64 -I/cm/shared/apps/cuda50/toolkit/5.0.35 -I. -I.. -I../../common/inc -o BlackScholes_gold.o -c BlackScholes_gold.cpp 
g++ -m64 -o BlackScholes BlackScholes.o BlackScholes_gold.o -L/cm/shared/apps/cuda50/toolkit/5.0.35/lib64 -lcudart -lcudadevrt 
BlackScholes.o: In function `__sti____cudaRegisterAll_47_tmpxft_000059cb_00000000_6_BlackScholes_cpp1_ii_c58990ec()': 
tmpxft_000059cb_00000000-3_BlackScholes.cudafe1.cpp:(.text+0x1354): undefined reference to `__cudaRegisterLinkedBinary_47_tmpxft_000059cb_00000000_6_BlackScholes_cpp1_ii_c58990ec' 
collect2: ld returned 1 exit status 
make: *** [BlackScholes] Error 1 

I have one .cpp file, one .cu file, and one .cuh file. The relevant part of my makefile is below:

# CUDA code generation flags 
#GENCODE_SM10 := -gencode arch=compute_10,code=sm_10 
GENCODE_SM20  := -gencode arch=compute_20,code=sm_20 
GENCODE_SM30  := -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 
GENCODE_SM35  := -gencode arch=compute_35,code=sm_35 
#GENCODE_FLAGS := $(GENCODE_SM10) $(GENCODE_SM20) $(GENCODE_SM30) 
GENCODE_FLAGS := $(GENCODE_SM35) 

# OS-specific build flags 
ifneq ($(DARWIN),) 
     LDFLAGS := -Xlinker -rpath $(CUDA_LIB_PATH) -L$(CUDA_LIB_PATH) -lcudart -lcudadevrt 
     CCFLAGS := -arch $(OS_ARCH) 
else 
    ifeq ($(OS_SIZE),32) 
     LDFLAGS := -L$(CUDA_LIB_PATH) -lcudart -lcudadevrt 
     CCFLAGS := -m32 
    else 
     LDFLAGS := -L$(CUDA_LIB_PATH) -lcudart -lcudadevrt 
     CCFLAGS := -m64 
    endif 
endif 

# OS-architecture specific flags 
ifeq ($(OS_SIZE),32) 
     NVCCFLAGS := -m32 -dc 
else 
     NVCCFLAGS := -m64 -dc 
endif 

# Debug build flags 
ifeq ($(dbg),1) 
     CCFLAGS += -g 
     NVCCFLAGS += -g -G 
     TARGET := debug 
else 
     TARGET := release 
endif 


# Common includes and paths for CUDA 
INCLUDES  := -I$(CUDA_INC_PATH) -I. -I.. -I../../common/inc 

# Additional parameters 
MAXRREGCOUNT := -po maxrregcount=16 

# Target rules 
all: build 

build: BlackScholes 

BlackScholes.o: BlackScholes.cu 
     $(NVCC) $(NVCCFLAGS) $(EXTRA_NVCCFLAGS) $(GENCODE_FLAGS) -rdc=true -dlink $(MAXRREGCOUNT) $(INCLUDES) -o $@ -c $< 

BlackScholes_gold.o: BlackScholes_gold.cpp 
     $(GCC) $(CCFLAGS) $(INCLUDES) -o $@ -c $< 

BlackScholes: BlackScholes.o BlackScholes_gold.o 
     $(GCC) $(CCFLAGS) -o $@ $+ $(LDFLAGS) $(EXTRA_LDFLAGS) 
     mkdir -p ../../bin/$(OSLOWER)/$(TARGET) 
     cp $@ ../../bin/$(OSLOWER)/$(TARGET) 

run: build 
     ./BlackScholes 

Answer


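When you use the host linker (g++) for the final link of your executable and you compile relocatable device code (nvcc -dc), an intermediate device code link step is required.

From the nvcc documentation: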

If you want to invoke the device and host linker separately, you can do: 

nvcc -arch=sm_20 -dc a.cu b.cu 
nvcc -arch=sm_20 -dlink a.o b.o -o link.o 
g++ a.o b.o link.o -L<path> -lcudart 

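Since you specified -dc on the compile line, you get a compile-only operation (just as if you had passed -c to g++).

Here is a modified, condensed Makefile that should show you what is involved: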

GENCODE_SM35  := -gencode arch=compute_35,code=sm_35 
GENCODE_FLAGS := $(GENCODE_SM35) 

LDFLAGS := -L/usr/local/cuda/lib64 -lcudart -lcudadevrt 
CCFLAGS := -m64 

NVCCFLAGS := -m64 -dc 

NVCC := nvcc 
GCC := g++ 

# Debug build flags 
ifeq ($(dbg),1) 
     CCFLAGS += -g 
     NVCCFLAGS += -g -G 
     TARGET := debug 
else 
     TARGET := release 
endif 


# Common includes and paths for CUDA 
INCLUDES  := -I/usr/local/cuda/include -I. -I.. 

# Additional parameters 
MAXRREGCOUNT := -po maxrregcount=16 

# Target rules 
all: build 

build: BlackScholes 

BlackScholes.o: BlackScholes.cu 
     $(NVCC) $(NVCCFLAGS) $(EXTRA_NVCCFLAGS) $(GENCODE_FLAGS) $(MAXRREGCOUNT) $(INCLUDES) -o $@ $< 
     $(NVCC) -dlink $(GENCODE_FLAGS) $(MAXRREGCOUNT) -o bs_link.o $@ 

BlackScholes_gold.o: BlackScholes_gold.cpp 
     $(GCC) $(CCFLAGS) $(INCLUDES) -o $@ -c $< 

BlackScholes: BlackScholes.o BlackScholes_gold.o bs_link.o 
     $(GCC) $(CCFLAGS) -o $@ $+ $(LDFLAGS) $(EXTRA_LDFLAGS) 

run: build 
     ./BlackScholes 

I needed to append "-lcudadevrt" to the "command" string (i.e. "NVCC -lcudadevrt"), and it worked after that. Thank you very much for your help, Robert. :-)


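Yes, I think it needs to be added for the intermediate step as well, which means my Makefile can be fixed by specifying `$(LDFLAGS)` on the intermediate (`-dlink`) link step.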
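
Putting the two comments above together, the device-link rule might look roughly like the following. This is only a sketch: it assumes the variables defined in the answer's Makefile (`NVCC`, `GCC`, `GENCODE_FLAGS`, `MAXRREGCOUNT`, `LDFLAGS`, and so on). Giving `bs_link.o` its own target is an illustrative choice here, not something from the original answer.

```makefile
# Sketch only; variable names are taken from the answer's Makefile.
# Recipe lines must begin with a tab character.

# Device link step: $(LDFLAGS) carries -lcudadevrt, which the comments
# above report is needed here for Dynamic Parallelism.
bs_link.o: BlackScholes.o
	$(NVCC) -dlink $(GENCODE_FLAGS) $(MAXRREGCOUNT) -o $@ $< $(LDFLAGS)

# Host link step: g++ links the host objects plus the device-link object.
BlackScholes: BlackScholes.o BlackScholes_gold.o bs_link.o
	$(GCC) $(CCFLAGS) -o $@ $+ $(LDFLAGS) $(EXTRA_LDFLAGS)
```

With a separate `bs_link.o` target, make should rerun the device link whenever `BlackScholes.o` changes. The `BlackScholes.o` rule then needs only the `-dc` compile line.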