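You just need to add a "present(grid)" clause to the kernels directive.
Here is an example with the fix, plus a few other changes, such as updating the data so that it can be printed on the host.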
% cat test.f90
program Test
  implicit none
  type dt
    integer :: n
    real, dimension(:), allocatable :: xm
  end type dt
  type(dt) :: grid
  integer :: i
  grid%n = 10
  allocate(grid%xm(grid%n))
  !$acc enter data copyin(grid)
  !$acc enter data create(grid%xm)
  !$acc kernels present(grid)
  do i = 1, grid%n
    grid%xm(i) = i * i
  enddo
  !$acc end kernels
  !$acc update host(grid%xm)
  print*,grid%xm
  !$acc exit data delete(grid%xm, grid)
  deallocate(grid%xm)
end program Test
% pgf90 -acc test.f90 -Minfo=accel -ta=tesla -V16.10; a.out
test:
16, Generating enter data copyin(grid)
17, Generating enter data create(grid%xm(:))
18, Generating present(grid)
19, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
19, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
23, Generating update self(grid%xm(:))
1.000000 4.000000 9.000000 16.00000
25.00000 36.00000 49.00000 64.00000
81.00000 100.0000
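Note that PGI 17.7 will include beta support for true deep copy in Fortran, as opposed to the manual deep copy shown above. Here is an example using true deep copy, enabled with -ta=tesla:deepcopy: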
% cat test_deep.f90
program Test
  implicit none
  type dt
    integer :: n
    real, dimension(:), allocatable :: xm
  end type dt
  type(dt) :: grid
  integer :: i
  grid%n = 10
  allocate(grid%xm(grid%n))
  !$acc enter data copyin(grid)
  !$acc kernels present(grid)
  do i = 1, grid%n
    grid%xm(i) = i * i
  enddo
  !$acc end kernels
  !$acc update host(grid)
  print*,grid%xm
  !$acc exit data delete(grid)
  deallocate(grid%xm)
end program Test
% pgf90 -acc test_deep.f90 -Minfo=accel -ta=tesla:deepcopy -V17.7 ; a.out
test:
16, Generating enter data copyin(grid)
17, Generating present(grid)
18, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
22, Generating update self(grid)
1.000000 4.000000 9.000000 16.00000
25.00000 36.00000 49.00000 64.00000
81.00000 100.0000
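According to the documentation (PGI OpenACC guide, v2015 and v2017): "Arrays of derived type, where the derived type contains allocatable members, have not been tested and should not be considered supported for this release." https://stackoverflow.com/questions/45233207/allocatable-arrays-in-cuda-fortran-device-data-structures#comment77460575_45233207 –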
It turns out that commenting out the pcreate(grid%xm) makes the program run correctly. Does this mean that deep copy is now supported? – danny
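*"have not been tested and should not be considered supported"* ... that part applies to arrays. You have a single variable, so I don't know; try searching the manual. –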