How do I create a TBB task scheduler for OpenCV multicore image processing? C++

I am learning to use OpenCV and TBB. I have a multicore CPU and want my program to use all of its cores, so I need to learn how to process images in parallel.

I have read the Intel® Technology Journal article "Foundations for Scalable Multi-Core Software in Intel® Threading Building Blocks" (the PDF is available here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.71.8289&rep=rep1&type=pdf).

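They use the computation of Fibonacci numbers as an example of multiprocessing. The TBB package also ships a similar Fibonacci example (see ParallelTaskFib). The only problem is that the computation is so simple that it puts little load on the CPU, so running many tasks with a small CutOff is inefficient because the overhead dominates. To learn TBB I therefore need a heavier example, such as image processing. My concept is to use the TBB task scheduler. I started from the FibTask class and ParallelFib, renamed them, and changed the parameters to work on vectors of images. The basic design should stay the same. The Fibonacci example has only two children, called a and b. My problem is that I am not sure whether I can use more than two children inside one function of matTask (the one called execute). So I tried adding more child allocations, more references, and more spawn_and_wait_for_all() calls... At this stage I have not written any image processing function, because I first want to ask whether this design is correct and whether it has performance problems. It is not finished. I look forward to your suggestions for fixing possible mistakes in my concept.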

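My basic idea is to apply a filter such as Gaussian blur to lena.jpg. First I pass in the number of threads. I have 8 cores, so I would pass at most 8 threads. I plan to split the Lena image into 8 equally sized images, copy their pixels into a vector (8 base images), and blur each of them. In a second stage I need to create another 7-8 images that overlap the edges between the 8 regions, and repeat the blurring on those. Finally, one more pass is needed for whatever region is left over after dividing the image by source_image.rows() / 8.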

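The main problem I need to solve (and do not know how) is stopping the infinite loop. Should I create separate classes and methods for 1) copying, 2) blurring, 3) cropping, and 4) pasting? Or can I do everything (copy + blur) in one call? This differs from the Fibonacci example, where the code does the same thing at every level, whereas I need to do several different things... So what should the logic be, how should the steps be ordered, and how should the functions be named?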

A simpler solution is to use 8 equally sized strips... and then 7-8 overlapping regions.
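The strip layout described above can be sketched as plain index arithmetic. This is a minimal sketch, not a definitive design: Strip and make_strips are hypothetical names that do not come from OpenCV or TBB, and halo is assumed to be the number of context rows the filter needs on each side (1 for a 3x3 kernel). Each strip writes only its own rows but reads halo extra rows from the shared source, which would make a separate pass over the seam regions and a separate pass for the leftover rows unnecessary.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: one horizontal strip of an image.
struct Strip {
    int write_begin;  // first row this strip writes
    int write_end;    // one past the last row this strip writes
    int read_begin;   // first row this strip reads (halo included)
    int read_end;     // one past the last row this strip reads
};

// Split `rows` image rows into at most `count` contiguous strips.
// Remainder rows are handed out one each to the first strips, so no
// extra "leftover" strip is created. Read ranges are widened by `halo`
// rows on both sides and clamped to the image.
std::vector<Strip> make_strips(int rows, int count, int halo) {
    assert(rows >= 0 && count > 0 && halo >= 0);
    std::vector<Strip> strips;
    const int base = rows / count;
    const int extra = rows % count;
    int begin = 0;
    for (int i = 0; i < count; ++i) {
        const int height = base + (i < extra ? 1 : 0);
        const int end = begin + height;
        if (height > 0) {  // skip empty strips when rows < count
            strips.push_back({begin, end,
                              std::max(0, begin - halo),
                              std::min(rows, end + halo)});
        }
        begin = end;
    }
    return strips;
}
```

With OpenCV, the read and write ranges would presumably map onto src.rowRange(read_begin, read_end) and dst.rowRange(write_begin, write_end); that mapping is an assumption and is not shown here.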

The code below reports no errors, but it is not meant to return a correct result, because it is only a temporary concept.

#include "opencv2/imgproc/imgproc.hpp" 
#include "opencv2/highgui/highgui.hpp" 
#include <iostream> 
#include <stdlib.h> 
#include <stdio.h> 

#include "tbb/task.h" 
#include "tbb/task_scheduler_init.h" 

#define CutOff 12 

using namespace cv; 

void SerialAction(int n){}; 

/** 

**/ 
class matTask: public tbb::task { 
public: 
    int n; 
    const int offset; 
    std::vector<cv::Mat> main_layers; 
    std::vector<cv::Mat> overlay_layers; 

    matTask(std::vector<cv::Mat>main_layers_, std::vector<cv::Mat> overlay_layers_, int n_, const int offset_) : 
     main_layers(main_layers_), 
     overlay_layers(overlay_layers_), 
     n(n_), offset(offset_) 
     {} 

     task* execute() { 
     if(n<CutOff) { 
      SerialAction(n); 
      } 
     else { 
      // Main layers - copy regions 
      matTask& a = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n,0); 
      matTask& b = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-1,0); 
      matTask& c = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-2,0); 
      matTask& d = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-3,0); 
      matTask& e = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-4,0); 
      matTask& f = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-5,0); 
      matTask& g = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-6,0); 
      matTask& h = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-7,0); 

      spawn_and_wait_for_all(a); 
      spawn_and_wait_for_all(b); 
      spawn_and_wait_for_all(c); 
      spawn_and_wait_for_all(d); 
      spawn_and_wait_for_all(e); 
      spawn_and_wait_for_all(f); 
      spawn_and_wait_for_all(g); 
      spawn_and_wait_for_all(h); 
      // In the case of effect: 
      // Overlay layers 

      matTask& ab = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n,offset); 
      matTask& bc = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-1,offset); 
      matTask& cd = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-2,offset); 
      matTask& de = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-2,offset); 
      matTask& ef = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-2,offset); 
      matTask& gh = *new(allocate_child()) 
       matTask(main_layers,overlay_layers,n-2,offset); 

      // ... + crop .. depends on size of kernel 

      set_ref_count(8); 
      spawn(b); 
      spawn_and_wait_for_all(a); 
     } 
    return NULL; 
    } 
}; 
void ParallelAction(std::vector<cv::Mat> main, std::vector<cv::Mat> overlays, int n, const int offset) { 
    matTask& a = *new(tbb::task::allocate_root()) 
    matTask(main, overlays, n,offset); 
    tbb::task::spawn_root_and_wait(a); 
} 

int main(int argc, char** argv) 
{  
    int threads = 8; 

    std::vector<cv::Mat> main_layers; 
    std::vector<cv::Mat> overlays; 

    cv:: Mat sourceImg; 
    sourceImg = imread("../../data/lena.jpg"); 
    if (sourceImg.empty()) 
     return -1; 

    const int offset = (int) sourceImg.rows/threads; 


    cv::setNumThreads(0); 
    ParallelAction(main_layers, overlays, threads, offset); 

    // GaussianBlur(src, dst, Size(3,3), 0, 0, BORDER_DEFAULT); 

    return 0; 
}

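Edit: in reaction to Anton's answer. If I overload operator(), when exactly is operator() applied? Is it also possible to add other methods to ApplyFoo? Only operator() is overloaded, and it seems there is just that one method.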

void Foo(float a){}; 

class ApplyFoo { 
    float *const my_a; 
public: 
    void operator()(const tbb::blocked_range<size_t>& r) const { 
     float *a = my_a; 
     for(size_t i=r.begin(); i!=r.end(); ++i) 
      Foo(a[i]); 
    } 
    ApplyFoo(float a[]) : 
     my_a(a) // initiate my_a 
    {} 
};

2016-06-30 John Boe

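You can inherit from OpenCV's 'ParallelLoopBody' and use 'cv::parallel_for_', which will use TBB if it is available. You can see an example [here](http://stackoverflow.com/a/34315148/5008845) that does a grayscale conversion; you can probably adapt it to your needs – Miki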

Thanks, I will try the example later. For now I am trying to work with TBB. –

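The article you point to is from 2007! It is quite dated (though still relevant, since TBB has kept full source compatibility). The tbb::task interface is considered low-level and is not convenient for application development. Please refer to tbb::parallel_for, tbb::parallel_invoke, and especially tbb::task_group, which directly supports cancellation.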

2016-06-30 16:31:34 Anton

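I updated the code in the question. If I overload operator(), when is it called? Is it possible to use additional functions in the ApplyFoo class? Can the Foo function be moved into the ApplyFoo class, or would that conflict with operator()? Alternatively, can Foo be a method of a different class? In that case, how do I pass a reference to the object that Foo belongs to? Where should the pointer go? I think the only solution is to store it in the ApplyFoo class through the constructor, so that I can access it from operator(). –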

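@JohnBoe You can define any class with 'operator()', or you can use a lambda expression such as '[](const tbb::blocked_range &r){ do_it_inline(); }'. It is called at some point before 'parallel_for' finishes, or before 'parallel_invoke' or 'task_group.wait()' finishes. – Anton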

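So is it necessary to create two classes if I do not want to keep these functions outside a class? I created a BlurAction_wrapper class that should create all of the actions, such as Copy, Blur, Crop, and Merge: 'BlurAction_wrapper *BlurAction = new BlurAction_wrapper(sourceImg, &targetImg, iArgs);'. Should I move the public part of ApplyFoo into the BlurAction class (renaming it), or keep them separate? –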
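On the question of keeping everything in one class: a body class is an ordinary C++ class, so it can carry state and private helper methods next to operator(). The following is a minimal sketch under stated assumptions. ScaleBody, apply_one and IndexRange are hypothetical names; IndexRange is only a stand-in with the same begin()/end() shape as tbb::blocked_range<size_t>, used so that the sketch compiles without TBB.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for tbb::blocked_range<std::size_t> (assumption: the body
// only ever calls begin() and end() on the range).
struct IndexRange {
    std::size_t first;
    std::size_t last;
    std::size_t begin() const { return first; }
    std::size_t end() const { return last; }
};

class ScaleBody {
    float* const data_;    // shared array; each range touches disjoint elements
    const float factor_;   // extra state handed in through the constructor

    // Private helper: the per-element work lives inside the class,
    // so no free function like Foo is required.
    void apply_one(std::size_t i) const { data_[i] *= factor_; }

public:
    ScaleBody(float* data, float factor) : data_(data), factor_(factor) {
        assert(data != nullptr);
    }

    // Invoked once per sub-range by whatever algorithm drives the body.
    // Templated on the range type so it accepts any range that exposes
    // begin() and end() as indices.
    template <class Range>
    void operator()(const Range& r) const {
        for (std::size_t i = r.begin(); i != r.end(); ++i) {
            apply_one(i);
        }
    }
};
```

With TBB available, the same object could presumably be passed as tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n), ScaleBody(a, 2.0f)). The body has to be copyable and operator() has to be const, which is why the helper is const as well.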
