2016-12-07

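"private variable cannot be reduction" although the variable is defined outside the SIMD block

I have a C++ project that uses OpenMP, and I am trying to compile it with LLVM on Blue Gene/Q. One function, stripped down, looks like this: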

template <typename FT, int veclen> 
inline void xmyNorm2Spinor(FT *res, 
          FT *x, 
          FT *y, 
          double &n2res, 
          int n, 
          int n_cores, 
          int n_simt, 
          int n_blas_simt) { 
#if defined(__GNUG__) && !defined(__INTEL_COMPILER) 
    double norm2res __attribute__((aligned(QPHIX_LLC_CACHE_ALIGN))) = 0; 
#else 
    __declspec(align(QPHIX_LLC_CACHE_ALIGN)) double norm2res = 0; 
#endif 

#pragma omp parallel shared(norm_array) 
    { 
     // […] 
     if (smtid < n_blas_simt) { 
      // […] 

      double lnorm = 0; 

//#pragma prefetch x,y,res 
//#pragma vector aligned(x,y,res) 
#pragma omp simd aligned(res, x, y : veclen) reduction(+ : lnorm) 
      for (int i = low; i < hi; i++) { 
       res[i] = x[i] - y[i]; 
       double tmpd = (double)res[i]; 
       lnorm += (tmpd * tmpd); 
      } 
      // […] 
     } 
    } 
    // […] 
} 

The error is this:

In file included from /homec/hbn28/hbn28e/Sources/qphix/tests/timeDslashNoQDP.cc:6: 
In file included from /homec/hbn28/hbn28e/Sources/qphix/include/qphix/blas.h:8: 
/homec/hbn28/hbn28e/Sources/qphix/include/qphix/blas_c.h:156:54: error: private variable cannot be reduction 
#pragma omp simd aligned(res,x,y:veclen) reduction(+:lnorm) 
                ^
/homec/hbn28/hbn28e/Sources/qphix/include/qphix/blas_c.h:151:12: note: predetermined as private 
           double lnorm=0; 
            ^

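Because of the outer omp parallel block, the variable lnorm is defined once per thread. Inside it there is an additional SIMD section, in which each thread uses its SIMD lanes. The reduction should happen within a single thread, so the scope of the variable looks right to me. The compiler still rejects it.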
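For reference, the pattern described above can be reduced to a self-contained sketch such as the following. The function name xmy_norm2_inline and the splitting of the range by thread ID are assumptions made for illustration; they are not taken from the QPhiX sources, and the aligned clause is left out because the sketch makes no alignment guarantees. Whether this version triggers the same diagnostic depends on the compiler and its version.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical reduced version of the pattern: lnorm is declared inside the
// parallel region (one copy per thread) and then used in a SIMD reduction.
double xmy_norm2_inline(double *res, const double *x, const double *y, int n) {
    assert(n >= 0);
    double total = 0.0;

#pragma omp parallel reduction(+ : total)
    {
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
        const int nthreads = omp_get_num_threads();
#else
        // Without OpenMP the pragmas are ignored and the code runs serially.
        const int tid = 0;
        const int nthreads = 1;
#endif
        // Split [0, n) into contiguous chunks, one per thread (illustrative).
        const int low = static_cast<int>(static_cast<long long>(n) * tid / nthreads);
        const int hi = static_cast<int>(static_cast<long long>(n) * (tid + 1) / nthreads);

        double lnorm = 0.0; // private to the thread because of its scope

#pragma omp simd reduction(+ : lnorm)
        for (int i = low; i < hi; i++) {
            res[i] = x[i] - y[i];
            lnorm += res[i] * res[i];
        }

        total += lnorm; // combined across threads by the parallel reduction
    }
    return total;
}
```

Calling it with x = {3, 5, 7, 9} and y = {1, 2, 3, 4} should fill res with the element-wise differences and return the sum of their squares.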

What is wrong here?

Answer

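The problem seems to be that the private attribute which the omp parallel block attaches to the lnorm variable conflicts with the requirements that the OpenMP reduction() clause places on its argument variables (even though lnorm is not private with respect to the nested omp simd block to which the reduction() clause applies).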

You can try to work around the problem by extracting the lnorm computation into its own function:

template <typename FT, int veclen> 
inline double compute_res_and_lnorm(FT *res, 
          FT *x, 
          FT *y, 
          int low, 
          int hi) 
{ 
    double lnorm = 0; 

#pragma omp simd aligned(res, x, y : veclen) reduction(+ : lnorm) 
    for (int i = low; i < hi; i++) { 
     res[i] = x[i] - y[i]; 
     double tmpd = (double)res[i]; 
     lnorm += (tmpd * tmpd); 
    } 
    return lnorm; 
} 

template <typename FT, int veclen> 
inline void xmyNorm2Spinor(FT *res, 
          FT *x, 
          FT *y, 
          double &n2res, 
          int n, 
          int n_cores, 
          int n_simt, 
          int n_blas_simt) { 
#if defined(__GNUG__) && !defined(__INTEL_COMPILER) 
    double norm2res __attribute__((aligned(QPHIX_LLC_CACHE_ALIGN))) = 0; 
#else 
    __declspec(align(QPHIX_LLC_CACHE_ALIGN)) double norm2res = 0; 
#endif 

#pragma omp parallel shared(norm_array) 
    { 
     // […] 
     if (smtid < n_blas_simt) { 
      // […] 
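      // veclen cannot be deduced from the arguments, so pass the template arguments explicitly
      double lnorm = compute_res_and_lnorm<FT, veclen>(res, x, y, low, hi);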
      // […] 
     } 
    } 
    // […] 
}
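To try the workaround in isolation, a minimal self-contained sketch could look like the one below. The names sub_and_norm2 and xmy_norm2 and the splitting of the range by thread ID are hypothetical and chosen only for illustration. The aligned clause and the veclen parameter are omitted, since the sketch does not guarantee any alignment. I have not verified that this avoids the error on the Blue Gene/Q toolchain; it only shows the structure of the idea.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: here lnorm is an ordinary local variable of a function,
// not a variable declared lexically inside a parallel construct.
template <typename FT>
inline double sub_and_norm2(FT *res, const FT *x, const FT *y, int low, int hi) {
    double lnorm = 0.0;

#pragma omp simd reduction(+ : lnorm)
    for (int i = low; i < hi; i++) {
        res[i] = x[i] - y[i];
        const double tmpd = static_cast<double>(res[i]);
        lnorm += tmpd * tmpd;
    }
    return lnorm;
}

// Hypothetical driver: each thread handles one contiguous chunk and the
// per-thread partial norms are summed by the parallel reduction.
template <typename FT>
double xmy_norm2(FT *res, const FT *x, const FT *y, int n) {
    assert(n >= 0);
    double total = 0.0;

#pragma omp parallel reduction(+ : total)
    {
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
        const int nthreads = omp_get_num_threads();
#else
        // Without OpenMP the pragmas are ignored and the code runs serially.
        const int tid = 0;
        const int nthreads = 1;
#endif
        const int low = static_cast<int>(static_cast<long long>(n) * tid / nthreads);
        const int hi = static_cast<int>(static_cast<long long>(n) * (tid + 1) / nthreads);

        total += sub_and_norm2(res, x, y, low, hi);
    }
    return total;
}
```

The SIMD reduction and the cross-thread reduction are kept in separate functions, so the compiler only ever sees lnorm as a plain local at the point where the reduction clause names it.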