2014-04-17

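Can anyone tell me why the maximum work items and the number of compute units reported for my CPU are higher than those reported for my GPU? Does that mean the CPU performs better than the GPU? The clinfo output for both devices is below.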

CPU: Intel Core i7 2.2 GHz. GPU: AMD Radeon HD 6700M.



Number of platforms:        2 
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 AMD-APP (1084.2) 
    Platform Name:         AMD Accelerated Parallel Processing
    Platform Vendor:        Advanced Micro Devices, Inc. 
    Platform Extensions:       cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 
    Platform Name:         Intel(R) OpenCL 
    Platform Vendor:        Intel(R) Corporation 
    Platform Extensions:       cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing


    Platform Name:         AMD Accelerated Parallel Processing
Number of devices:        2 
    Device Type:         CL_DEVICE_TYPE_GPU 
    Device ID:          4098 
    Max compute units:        6 
    Max work items dimensions:      3 
    Max work items[0]:       256 
    Max work items[1]:       256 
    Max work items[2]:       256 
    Max work group size:       256 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     4 
    Preferred vector width double:     0 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      4 
    Native vector width double:     0 
    Max clock frequency:       725Mhz 
    Address bits:         32 
    Max memory allocation:       536870912 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     1024 
    Alignment (bits) of base address:    2048 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          No 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         None 
    Cache line size:        0 
    Cache size:         0 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Scratchpad 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  64 
    Error correction support:      0 
    Unified memory for Host and Device:   0 
    Profiling timer resolution:     1 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      No 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:           Turks 
    Vendor:          Advanced Micro Devices, Inc. 
    Driver version:        1084.2 (VM) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_dx9_media_sharing


    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          4098 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     8 
    Preferred vector width double:     4 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2195Mhz 
    Address bits:         32 
    Max memory allocation:       1073741824 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       8192 
    Max image 2D height:       8192 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     4096 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         32768 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  1 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          GenuineIntel 
    Driver version:        1084.2 (sse2,avx) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing


    Platform Name:         Intel(R) OpenCL 
Number of devices:        1 
    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          32902 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     1 
    Preferred vector width short:     1 
    Preferred vector width int:     1 
    Preferred vector width long:     1 
    Preferred vector width float:     1 
    Preferred vector width double:     1 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2200Mhz 
    Address bits:         32 
    Max memory allocation:       536838144 
    Image support:         Yes 
    Max number of images read arguments:   480 
    Max number of images write arguments:   480 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     480 
    Max size of kernel argument:     3840 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        No 
    Round to +ve and infinity:     No 
    IEEE754-2008 fused multiply-add:    No 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         262144 
    Global memory size:       2147352576 
    Constant buffer size:       131072 
    Max number of constant args:     480 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  128 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        Yes 
    Profiling :         Yes 
    Platform ID:         00401218 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          Intel(R) Corporation 
    Driver version:        3.0.1.15216 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 (Build 80752) 
    Extensions:         cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing

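Also, why do I see three devices, two of type CPU and one GPU? Is the device on the Intel OpenCL platform the CPU or the integrated GPU? I have two display adapters: AMD Radeon HD 6700M Series and Intel HD Graphics Family.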

Flagging this as a thread that belongs on [su](http://superuser.com/). –

What does "Max compute units: 6" or "8" mean? Is 8 the number of cores in my Intel Core i7? And the GPU only has 6? – user1848223

Any help, please? – user1848223

Answer

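"How many cores/processing elements/hardware threads does my GPU have?" is a very common question from new GPGPU users. My usual reply is "why do you care?". The OpenCL API reports the number of compute units (`CL_DEVICE_MAX_COMPUTE_UNITS`), but it gives you no way to query how many processing elements a device has. What exactly counts as a processing element or a compute unit also varies widely from one architecture to another.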
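
To make that concrete for the two devices in the question, here is a rough sketch in Python. The compute-unit counts and the CPU vector width are taken from the clinfo listing; the GPU figure of 16 VLIW5 processors with 5 ALUs each per compute unit is an assumption based on public specifications for Turks, not something OpenCL reports. The "lanes" number is only a crude count of parallel float operations and is not a performance estimate.

```python
# Values marked "clinfo" come from the listing in the question.
# Values marked "assumed" are NOT reported by OpenCL; they are taken from
# public hardware specifications and are used here for illustration only.
devices = {
    "Turks GPU": {
        "compute_units": 6,        # clinfo: Max compute units
        "lanes_per_unit": 16 * 5,  # assumed: 16 VLIW5 processors x 5 ALUs
    },
    "Core i7-2670QM CPU": {
        "compute_units": 8,        # clinfo: matches 4 cores x 2 hardware threads
        "lanes_per_unit": 8,       # clinfo: Native vector width float
    },
}


def total_lanes(device):
    """Crude count of float lanes: compute units times lanes per unit."""
    return device["compute_units"] * device["lanes_per_unit"]


for name, device in devices.items():
    print(f"{name}: {device['compute_units']} compute units, "
          f"about {total_lanes(device)} float lanes")
```

The device with fewer compute units ends up with far more lanes, which is why comparing the "Max compute units" values of a CPU and a GPU tells you very little.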

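The reality is that it doesn't matter how many processing elements a device has, because that number is a very poor way to estimate performance. If you really need to know how fast a device is for a particular application, benchmark it, either with the application itself or with a micro-benchmark that has similar properties.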
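
A minimal timing harness for such a benchmark might look like the sketch below. It is plain Python and not tied to any OpenCL binding: `run_once` is a hypothetical callable that you would replace with code that enqueues your kernel and blocks until the device has finished, and `fake_kernel` is only a stand-in workload.

```python
import statistics
import time


def benchmark(run_once, warmup=2, repeats=10):
    """Return the median wall-clock time, in seconds, of run_once()."""
    # Warm-up runs are discarded: the first calls often include kernel
    # compilation, cold caches, or clock ramp-up.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()  # must not return before the work has actually finished
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_kernel():
    """Stand-in workload; replace with a real kernel launch plus wait."""
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    print(f"median time: {benchmark(fake_kernel):.6f} s")
```

Run the same harness against each device with the same problem size and compare the medians; that comparison is the one that matters for your application.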

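To answer your other question: you have two OpenCL implementations on your system that can use the CPU, Intel's and AMD's. Both platforms therefore report the CPU as an available OpenCL device, which is why it is listed twice.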
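
You can see this grouping directly in the clinfo listing. The sketch below is a small Python parser written for illustration; it assumes each `key: value` pair sits on a single, unwrapped line, and `CLINFO_EXCERPT` is an abbreviated copy of the output in the question.

```python
from collections import defaultdict

# Abbreviated excerpt of the clinfo device listing from the question.
CLINFO_EXCERPT = """\
    Platform Name:         AMD Accelerated Parallel Processing
Number of devices:        2
    Device Type:         CL_DEVICE_TYPE_GPU
    Max compute units:        6
    Name:           Turks
    Device Type:         CL_DEVICE_TYPE_CPU
    Max compute units:        8
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Platform Name:         Intel(R) OpenCL
Number of devices:        1
    Device Type:         CL_DEVICE_TYPE_CPU
    Max compute units:        8
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
"""


def devices_by_platform(text):
    """Map each platform name to a list of (device type, device name)."""
    platforms = defaultdict(list)
    platform = None
    device_type = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Platform Name":
            platform = value
        elif key == "Device Type":
            device_type = value
        elif key == "Name" and platform is not None:
            # "Name" closes one device entry in the listing.
            platforms[platform].append((device_type, value))
    return dict(platforms)


if __name__ == "__main__":
    for platform, devices in devices_by_platform(CLINFO_EXCERPT).items():
        print(platform)
        for device_type, name in devices:
            print(f"    {device_type}: {name}")
```

The AMD platform lists the Turks GPU and the CPU, and the Intel platform lists the same CPU again, which gives the three devices in the question.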

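I'm tired of this question, but I think we will have to deal with it for a long time... It is actually a logical thing to ask. People will keep coming from the CPU world, trying to control every "thread" by hand and to know exactly how many there are, even when GPUs have millions of parallel cores... – DarkZeros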