的strided range interator基本上是一個精心設計排列迭代器一個仿函數給出相應的指數排列。
這裏是對跨步迭代器示例的修改。主要的改變是:
- 包括
chunk
大小作爲一個迭代參數
- 修改,提供了指數對置換迭代吐出所需序列
- 調整
.end()
迭代器的定義,以提供仿序列的適當長度。
樣例:
$ cat t1280.cu
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/functional.h>
#include <thrust/fill.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/sequence.h>
#include <iostream>
#include <assert.h>
// this example illustrates how to make strided-chunk access to a range of values
// examples:
// strided_chunk_range([0, 1, 2, 3, 4, 5, 6], 1,1) -> [0, 1, 2, 3, 4, 5, 6]
// strided_chunk_range([0, 1, 2, 3, 4, 5, 6], 2,1) -> [0, 2, 4, 6]
// strided_chunk_range([0, 1, 2, 3, 4, 5, 6], 3,2) -> [0 ,1, 3, 4, 6]
// ...
template <typename Iterator>
class strided_chunk_range
{
public:
typedef typename thrust::iterator_difference<Iterator>::type difference_type;
struct stride_functor : public thrust::unary_function<difference_type,difference_type>
{
difference_type stride;
int chunk;
stride_functor(difference_type stride, int chunk)
: stride(stride), chunk(chunk) {}
__host__ __device__
difference_type operator()(const difference_type& i) const
{
int pos = i/chunk;
return ((pos * stride) + (i-(pos*chunk)));
}
};
typedef typename thrust::counting_iterator<difference_type> CountingIterator;
typedef typename thrust::transform_iterator<stride_functor, CountingIterator> TransformIterator;
typedef typename thrust::permutation_iterator<Iterator,TransformIterator> PermutationIterator;
// type of the strided_range iterator
typedef PermutationIterator iterator;
// construct strided_range for the range [first,last)
strided_chunk_range(Iterator first, Iterator last, difference_type stride, int chunk)
: first(first), last(last), stride(stride), chunk(chunk) {assert(chunk<=stride);}
iterator begin(void) const
{
return PermutationIterator(first, TransformIterator(CountingIterator(0), stride_functor(stride, chunk)));
}
iterator end(void) const
{
int lmf = last-first;
int nfs = lmf/stride;
int rem = lmf-(nfs*stride);
return begin() + (nfs*chunk) + ((rem<chunk)?rem:chunk);
}
protected:
Iterator first;
Iterator last;
difference_type stride;
int chunk;
};
int main(void)
{
thrust::device_vector<int> data(50);
thrust::sequence(data.begin(), data.end());
typedef thrust::device_vector<int>::iterator Iterator;
// create strided_chunk_range
std::cout << "stride 3, chunk 2, length 7" << std::endl;
strided_chunk_range<Iterator> scr1(data.begin(), data.begin()+7, 3, 2);
thrust::copy(scr1.begin(), scr1.end(), std::ostream_iterator<int>(std::cout, " ")); std::cout << std::endl;
std::cout << "stride 8, chunk 3, length 50" << std::endl;
strided_chunk_range<Iterator> scr(data.begin(), data.end(), 8, 3);
thrust::copy(scr.begin(), scr.end(), std::ostream_iterator<int>(std::cout, " ")); std::cout << std::endl;
return 0;
}
$ nvcc -arch=sm_35 -o t1280 t1280.cu
$ ./t1280
stride 3, chunk 2, length 7
0 1 3 4 6
stride 8, chunk 3, length 50
0 1 2 8 9 10 16 17 18 24 25 26 32 33 34 40 41 42 48 49
$
這可能不是最優化的實施,特別是因爲我們在置換函子做除法,但它應該讓你開始。
我假設(並測試)chunk<=stride
,因爲這對我來說似乎很合理,並簡化了我的思維過程。我相信它可以被修改,並有一個合適的例子,你可以看到什麼樣的序列,例如chunk>stride
。
是的! chunk <= stride,現在我將在您的示例中儘可能優化它。謝謝! – Fernando23
@ Fernando23你能分享你的解決方案嗎? – Olumide