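Detecting the OpenCL device vendor in kernel code
I am writing some platform-specific optimizations. I know that I can parse the vendor string in the host code and pass the result to the kernel with the -D option, but it might be more convenient to detect the vendor directly in the kernel, without any host participation (so that kernels can be optimized even without access to the host source code, ...).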
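For reference, the host-side approach mentioned above could look roughly like the sketch below. The helper name `vendor_build_option`, the `-DINTEL` define, and the substrings being matched are all assumptions made for illustration; actual vendor strings vary between drivers and should be checked on the target platform.

```c
#include <string.h>

/* Hypothetical helper: maps a CL_DEVICE_VENDOR string to a -D build option.
 * The matched substrings are assumptions about typical vendor strings,
 * not an authoritative list. */
static const char *vendor_build_option(const char *vendor)
{
    if (vendor == NULL)
        return "";
    if (strstr(vendor, "NVIDIA") != NULL)
        return "-DNVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") != NULL ||
        strstr(vendor, "AMD") != NULL)
        return "-DAMD";
    if (strstr(vendor, "Intel") != NULL)
        return "-DINTEL";
    return ""; /* unknown vendor: no extra define */
}
```

The vendor string would come from `clGetDeviceInfo` with `CL_DEVICE_VENDOR`, and the returned option would be passed in the options argument of `clBuildProgram`.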
So far, I have come up with the following:
#ifdef __NV_CL_C_VERSION
/**
* @def NVIDIA
* @brief defined when compiling on NVIDIA GPUs
*/
#define NVIDIA
#endif // __NV_CL_C_VERSION
#if defined(__WinterPark__) || defined(__BeaverCreek__) || defined(__Turks__) || \
defined(__Caicos__) || defined(__Tahiti__) || defined(__Pitcairn__) || \
defined(__Capeverde__) || defined(__Cayman__) || defined(__Barts__) || \
defined(__Cypress__) || defined(__Juniper__) || defined(__Redwood__) || \
defined(__Cedar__) || defined(__ATI_RV770__) || defined(__ATI_RV730__) || \
defined(__ATI_RV710__) || defined(__Loveland__) || defined(__GPU__) || \
defined(__Hawaii__)
/**
* @def AMD
* @brief defined when compiling on AMD GPUs
* @note This list was originally found at https://github.com/magnumripper/JohnTheRipper/wiki/Predefined-macros-in-OpenCL-(standard-and-proprietary) and copied shamelessly. It is most definitely incomplete and contains the troubling __GPU__.
* @note AMD also defines __CPU__ when compiling for CL_DEVICE_TYPE_CPU.
*/
#define AMD
#endif // ...
Any additions or corrections? Does anyone know which macros Intel defines?
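To illustrate how the NVIDIA and AMD macros from the detection block might be consumed, here is a minimal sketch. The names `VENDOR_SIMD_WIDTH` and `padded_length` are hypothetical, and the widths (32 for NVIDIA, 64 for AMD) are assumptions about typical warp and wavefront sizes rather than values guaranteed for every device.

```c
/* Pick a vendor-specific tuning constant, with a generic fallback
 * when neither NVIDIA nor AMD was detected. */
#if defined(NVIDIA)
#define VENDOR_SIMD_WIDTH 32u /* assumption: warp size */
#elif defined(AMD)
#define VENDOR_SIMD_WIDTH 64u /* assumption: wavefront size */
#else
#define VENDOR_SIMD_WIDTH 1u  /* unknown vendor: no padding */
#endif

/* Hypothetical helper: rounds n up to the next multiple of
 * VENDOR_SIMD_WIDTH. */
static inline unsigned padded_length(unsigned n)
{
    return ((n + VENDOR_SIMD_WIDTH - 1u) / VENDOR_SIMD_WIDTH)
           * VENDOR_SIMD_WIDTH;
}
```

Keeping the fallback branch means the kernel still compiles on a device for which no vendor macro is known, which matters while the macro list is incomplete.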