I also made an unrolled version of the search:
https://github.com/nmmmnu/beating_binsearch/blob/master/improved_unroll.h
Here is the problematic code:
char search(uint64_t const start1, uint64_t const end1, const T *data, const T key, uint64_t &index) const{
    /*
     * Lazy based from Linux kernel...
     * http://lxr.free-electrons.com/source/lib/bsearch.c
     */
    uint64_t start = start1;
    uint64_t end   = end1;

    char cmp = 0;

    //while (start < end){
    while (start < end){
        // uint64_t mid = start + ((end - start)/2);
        uint64_t mid = start + ((end - start) >> 1);

        //char cmp = _linear(mid - CACHE_COUNT_2, mid + CACHE_COUNT_2, data, key, mid);

        #define _LINE_HALF_SIZE 7

        #define _LINE(i)                                \
            if (i >= end){                              \
                start = mid + _LINE_HALF_SIZE + 1;      \
                continue;                               \
            }                                           \
                                                        \
            cmp = comp.cmp(data[i], key);               \
                                                        \
            if (cmp == 0){                              \
                index = i;                              \
                return 0;                               \
            }                                           \
                                                        \
            if (cmp > 0){                               \
                end = i + 1;                            \
                continue;                               \
            }

        _LINE(mid - 7);
        _LINE(mid - 6);
        _LINE(mid - 5);
        _LINE(mid - 4);
        _LINE(mid - 3);
        _LINE(mid - 2);
        _LINE(mid - 1);
        _LINE(mid    );
        _LINE(mid + 1);
        _LINE(mid + 2);
        _LINE(mid + 3);
        _LINE(mid + 4);
        _LINE(mid + 5);
        _LINE(mid + 6);
        _LINE(mid + 7);

        #undef _LINE

        start = mid + _LINE_HALF_SIZE + 1;
    }

    index = start;
    return cmp;
}
There still seem to be too many branch mispredictions, because if I remove the following if statement:

if (i >= end){ \
    start = mid + _LINE_HALF_SIZE + 1; \
    continue; \
} \

the speed "magically" becomes equal to or even better than classical binary search. Of course, with that branch removed the algorithm no longer works correctly, but it clearly shows why this algorithm is slower than a classical binary search.
I think you might get a better answer at http://codereview.stackexchange.com/ – user2079303
@user2079303 Perhaps, but the question would have to be completely reworked first. Questions on Code Review need to **a)** already work as intended, and **b)** have the code embedded in the question itself. Feature requests and requests to explain code are also off-topic there. – Kaz
I thought about that, but essentially everything is in the 'improved.h' file. The other files are just for the measurements, and it does not matter much how they are written. – Nick