2017-07-31

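Fortran OpenACC derived type with an allocatable component

I have read that a manual deep copy of Fortran derived types is possible, but the simple test program below fails at run time, even though it compiles cleanly with PGI v16.10. What is going wrong?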

program Test 

    implicit none 

    type dt 
     integer :: n 
     real, dimension(:), allocatable :: xm 
    end type dt 

    type(dt) :: grid 
    integer :: i 

    grid%n = 10 
    allocate(grid%xm(grid%n)) 

!$acc enter data copyin(grid) 
!$acc enter data pcreate(grid%xm) 

!$acc kernels 
    do i = 1, grid%n 
     grid%xm(i) = i * i 
    enddo 
!$acc end kernels 

    print*,grid%xm 

end program Test 

The error I get is:

call to cuStreamSynchronize returned error 700: Illegal address during kernel execution 
call to cuMemFreeHost returned error 700: Illegal address during kernel execution 

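According to the documentation (PGI OpenACC guide, v2015 and v2017): arrays of derived type, where the derived type contains allocatable members, have not been tested and should not be considered supported for this release. https://stackoverflow.com/questions/45233207/allocatable-arrays-in-cuda-fortran-device-data-structures#comment77460575_45233207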

It turns out that commenting out the pcreate(grid%xm) directive makes the program run correctly. Does this mean deep copy is now supported? – danny

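*"Have not been tested and should not be considered supported"* ... that bit is about arrays. You have a single variable, so I don't know; try searching the manual.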

Answer

You just need to add a "present(grid)" clause to the kernels directive.

Here is an example with the fix, plus a few other changes, such as updating the data on the host so that it can be printed.

% cat test.f90 
program Test 

    implicit none 

    type dt 
     integer :: n 
     real, dimension(:), allocatable :: xm 
    end type dt 

    type(dt) :: grid 
    integer :: i 

    grid%n = 10 
    allocate(grid%xm(grid%n)) 

!$acc enter data copyin(grid) 
!$acc enter data create(grid%xm) 
!$acc kernels present(grid) 
    do i = 1, grid%n 
     grid%xm(i) = i * i 
    enddo 
!$acc end kernels 
!$acc update host(grid%xm) 
    print*,grid%xm 

!$acc exit data delete(grid%xm, grid) 
    deallocate(grid%xm) 

end program Test 

% pgf90 -acc test.f90 -Minfo=accel -ta=tesla -V16.10; a.out 
test: 
    16, Generating enter data copyin(grid) 
    17, Generating enter data create(grid%xm(:)) 
    18, Generating present(grid) 
    19, Loop is parallelizable 
     Accelerator kernel generated 
     Generating Tesla code 
     19, !$acc loop gang, vector(128) ! blockidx%x threadidx%x 
    23, Generating update self(grid%xm(:)) 
    1.000000  4.000000  9.000000  16.00000 
    25.00000  36.00000  49.00000  64.00000 
    81.00000  100.0000 
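
As a variation on the manual deep copy above, the separate "update host" and "delete" steps can be folded into a copyout on exit. This is only a sketch: the program name test_copyout is made up for illustration, and it has not been verified against PGI 16.10, so treat it as an assumption about how the standard OpenACC enter/exit data clauses should behave rather than as confirmed compiler behavior.

```fortran
program test_copyout

    implicit none

    type dt
        integer :: n
        real, dimension(:), allocatable :: xm
    end type dt

    type(dt) :: grid
    integer :: i

    grid%n = 10
    allocate(grid%xm(grid%n))

    ! Manual deep copy: the parent structure goes first (shallow copy),
    ! then the allocatable member gets its own device allocation.
!$acc enter data copyin(grid)
!$acc enter data create(grid%xm)

    ! present(grid) tells the compiler the structure is already on the device.
!$acc kernels present(grid)
    do i = 1, grid%n
        grid%xm(i) = real(i * i)
    enddo
!$acc end kernels

    ! Assumption: copyout on exit data stands in for the separate
    ! "update host" + "delete" pair. Member first, then the parent.
!$acc exit data copyout(grid%xm)
!$acc exit data delete(grid)

    print *, grid%xm
    deallocate(grid%xm)

end program test_copyout
```

The data is released in the reverse order of its creation (member before parent), mirroring the enter data sequence. Built without -acc, the directives are treated as comments and the loop simply runs on the host, which gives a quick way to check the expected values.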

Note that PGI 17.7 will include beta support for true deep copy in Fortran, as opposed to the manual deep copy above. Here is an example using true deep copy:

% cat test_deep.f90 
program Test 

    implicit none 

    type dt 
     integer :: n 
     real, dimension(:), allocatable :: xm 
    end type dt 

    type(dt) :: grid 
    integer :: i 

    grid%n = 10 
    allocate(grid%xm(grid%n)) 

!$acc enter data copyin(grid) 
!$acc kernels present(grid) 
    do i = 1, grid%n 
     grid%xm(i) = i * i 
    enddo 
!$acc end kernels 
!$acc update host(grid) 
    print*,grid%xm 

!$acc exit data delete(grid) 
    deallocate(grid%xm) 

end program Test 

% pgf90 -acc test_deep.f90 -Minfo=accel -ta=tesla:deepcopy -V17.7 ; a.out 
test: 
    16, Generating enter data copyin(grid) 
    17, Generating present(grid) 
    18, Loop is parallelizable 
     Accelerator kernel generated 
     Generating Tesla code 
     18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x 
    22, Generating update self(grid) 
    1.000000  4.000000  9.000000  16.00000 
    25.00000  36.00000  49.00000  64.00000 
    81.00000  100.0000 

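Mat, thank you very much, I will use deep copy. Why is the present clause still needed even in v17.7? When I use a kernels region without any data clauses, I would expect an implicit copy/present of grid to take place; shouldn't the same apply here? – danny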

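I think this is a bug, since you are correct that the implicit copy should just work with deep copy. Granted, deep copy is a new, beta feature, so these kinds of issues are not unexpected.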