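OpenCL kernel not vectorized

I want to build a kernel that performs a parallel string search, and I plan to use a finite state machine for it. The FSM's transition table is passed in the kernel argument states. The code: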
__kernel void Find (__constant char *text,
                    const int offset,
                    const int tlenght,
                    __constant char *characters,
                    const int clength,
                    const int maxlength,
                    __constant int *states,
                    const int statesdim){
    private char c;
    private int state;
    private const int id = get_global_id(0);
    if (id<(tlenght-maxlength)) {
        private int cIndex,sd,s,k;
        for (int i=0; i<maxlength; i++) {
            c = text[i+offset];
            cIndex = -1;
            for (int j=0; j<clength; j++) {
                if (characters[j]==c) {
                    cIndex = j;
                }
            }
            if (cIndex==-1) {
                state = 0;
                break;
            } else {
                s = states[state+cIndex*statesdim];
            }
            if (state<=0) break;
        }
    }
}
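To make the intended table layout concrete, here is a scalar host-side sketch in plain C of what I expect a single work-item to compute. It is only an illustration under assumptions that the kernel above does not state: the function name fsm_run is made up, the state starts at 0, the text is read at start + i (the kernel reads text[i+offset]), a negative state means "accepted", and 0 means "no match". The table is indexed as states[state + cIndex*statesdim], exactly as in the kernel.

```c
/* Host-side sketch of the per-work-item FSM walk (hypothetical helper,
 * not part of the kernel). Assumptions: start state is 0, a negative
 * state signals acceptance, and 0 signals "no match". */
int fsm_run(const char *text, int start, int maxlength,
            const char *characters, int clength,
            const int *states, int statesdim)
{
    int state = 0; /* assumed start state; the kernel leaves it uninitialized */

    for (int i = 0; i < maxlength; i++) {
        char c = text[start + i];

        /* Map the character to its column in the transition table. */
        int cIndex = -1;
        for (int j = 0; j < clength; j++) {
            if (characters[j] == c) {
                cIndex = j;
            }
        }
        if (cIndex == -1) {
            return 0; /* character outside the alphabet: no match */
        }

        /* Same layout as the kernel: one block of statesdim entries
         * per alphabet symbol. */
        state = states[state + cIndex * statesdim];
        if (state <= 0) {
            break; /* rejected (0) or accepted (negative) */
        }
    }
    return state;
}
```

For example, with the alphabet {'a', 'b'}, statesdim = 2 and the table {1, 0, 0, -1} (a machine for the pattern "ab"), calling fsm_run on "xabab" with maxlength 2 returns -1 at start positions 1 and 3, and 0 at positions 0 and 2.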
If I compile this kernel with iocgui, I get the following result:
Using default instruction set architecture.
Intel OpenCL CPU device was found!
Device name: Pentium(R) Dual-Core CPU T4400 @ 2.20GHz
Device version: OpenCL 1.1 (Build 31360.31426)
Device vendor: Intel(R) Corporation
Device profile: FULL_PROFILE
Build started
Kernel <Find> was successfully vectorized
Done.
Build succeeded!
When I change the line where the new state is determined to:
state = states[state+cIndex*statesdim];
the result is:
Using default instruction set architecture.
Intel OpenCL CPU device was found!
Device name: Pentium(R) Dual-Core CPU T4400 @ 2.20GHz
Device version: OpenCL 1.1 (Build 31360.31426)
Device vendor: Intel(R) Corporation
Device profile: FULL_PROFILE
Build started
Kernel <Find> was not vectorized
Done.
Build succeeded!