
I am trying to use MobileNet SSD + the deep neural network (dnn) module in OpenCV for object detection. I loaded the model successfully and ran it. As the output of net.forward I obtain a Mat object containing information about the detected objects. Unfortunately, I am struggling with "the easy part": reading what exactly was detected. How do I read the data at specific coordinates of a high-dimensional Mat class using C++?

This is what I know about the output Mat containing the detection information:

  • It has 4 dimensions.
  • Its size is 1 x 1 x number_of_objects_detected x 7.
  • The seven values describing each object are: the first is the image/batch ID, the second is the class ID, the third is the confidence, and the fourth through seventh are the bounding box values.

I could not find any C++ examples, but I found many Python examples. They read the data like this:

for i in np.arange(0, detections.shape[2]):  
    confidence = detections[0, 0, i, 2] 

What is the easiest way to do this in C++? In other words, I need to read the data at specific coordinates in a high-dimensional Mat class.

Thanks for your help. I am quite new to C++ and sometimes find it overwhelming...

I am using OpenCV 3.3.0 and the MobileNet SSD from this GitHub repository: https://github.com/chuanqi305/MobileNet-SSD

The code of my program:

#include <opencv2/dnn.hpp> 
#include <opencv2/imgproc.hpp> 
#include <opencv2/highgui.hpp> 

#include <fstream> 
#include <iostream> 

using namespace cv; 
using namespace cv::dnn; 

using namespace std; 

// function to create vector of class names 
std::vector<String> createClaseNames() { 
    std::vector<String> classNames; 
    classNames.push_back("background"); 
    classNames.push_back("aeroplane"); 
    classNames.push_back("bicycle"); 
    classNames.push_back("bird"); 
    classNames.push_back("boat"); 
    classNames.push_back("bottle"); 
    classNames.push_back("bus"); 
    classNames.push_back("car"); 
    classNames.push_back("cat"); 
    classNames.push_back("chair"); 
    classNames.push_back("cow"); 
    classNames.push_back("diningtable"); 
    classNames.push_back("dog"); 
    classNames.push_back("horse"); 
    classNames.push_back("motorbike"); 
    classNames.push_back("person"); 
    classNames.push_back("pottedplant"); 
    classNames.push_back("sheep"); 
    classNames.push_back("sofa"); 
    classNames.push_back("train"); 
    classNames.push_back("tvmonitor"); 
    return classNames; 
} 

// main function 
int main(int argc, char **argv) 
{ 
    // set inputs 
    String modelTxt = "C:/Users/acer/Desktop/kurz_OCV/cv4faces/project/python/object-detection-deep-learning/MobileNetSSD_deploy.prototxt"; 
    String modelBin = "C:/Users/acer/Desktop/kurz_OCV/cv4faces/project/python/object-detection-deep-learning/MobileNetSSD_deploy.caffemodel"; 
    String imageFile = "C:/Users/acer/Desktop/kurz_OCV/cv4faces/project/puppies.jpg"; 
    std::vector<String> classNames = createClaseNames(); 

    //read caffe model 
    Net net; 
    try { 
     net = dnn::readNetFromCaffe(modelTxt, modelBin); 
    } 
    catch (cv::Exception& e) { 
     std::cerr << "Exception: " << e.what() << std::endl; 
     if (net.empty()) 
     { 
      std::cerr << "Can't load network." << std::endl; 
      exit(-1); 
     } 
    } 

    // read image 
    Mat img = imread(imageFile); 

    // create input blob 
    resize(img, img, Size(300, 300)); 
    Mat inputBlob = blobFromImage(img, 0.007843, Size(300, 300), Scalar(127.5)); //Convert Mat to dnn::Blob image batch 

    // apply the blob on the input layer 
    net.setInput(inputBlob); //set the network input 

    // classify the image by applying the blob on the net 
    Mat detections = net.forward("detection_out"); //compute output 

    // print some information about detections 
    std::cout << "dims: " << detections.dims << endl; 
    std::cout << "size: " << detections.size << endl; 

    //show image 
    String winName("image"); 
    imshow(winName, img); 

    // Wait for keypress 
    waitKey(); 

} 

There is a C++ sample using MobileNet-SSD from Caffe: https://github.com/opencv/opencv/blob/master/samples/dnn/ssd_mobilenet_object_detection.cpp –

Answers


Check out the official OpenCV tutorial on how to scan images.

The normal way you would access a 3-channel (i.e. color) Mat is with the Mat::at() method of the Mat class, which is heavily overloaded for a variety of accessor options. Specifically, you can pass in an array of indices or a vector of indices.


Here is a minimal example of creating a 4-D Mat and accessing a specific element:

#include <opencv2/opencv.hpp> 
#include <iostream> 

int main() { 
    int size[4] = { 2, 2, 5, 7 }; 
    cv::Mat M(4, size, CV_32FC1, cv::Scalar(1)); 
    int indx[4] = { 0, 0, 2, 3 }; 
    std::cout << "M[0, 0, 2, 3] = " << M.at<float>(indx) << std::endl; 
} 
M[0, 0, 2, 3] = 1 
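
For completeness, the same element can also be read through the Vec-of-indices overload of at() mentioned above; a minimal sketch of that variant (my addition, assuming OpenCV 3.x):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    int size[4] = { 2, 2, 5, 7 };
    cv::Mat M(4, size, CV_32FC1, cv::Scalar(1));
    // Same element as above, but indexed with a cv::Vec4i instead of an int array
    cv::Vec4i indx(0, 0, 2, 3);
    std::cout << "M[0, 0, 2, 3] = " << M.at<float>(indx) << std::endl;
}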

Thanks for the suggestion. In fact, I have already tried Mat::at() - but it seems to support at most three dimensions (https://docs.opencv.org/3.3.0/d3/d63/classcv_1_1Mat.html#a305829ed5c0ecfef7b44db18953048e8). Specifically, I just could not do this: 'int class = detections.at(0, 0, 0, 1);' – Betty


Thank you for the example, now I get it. – Betty


@Betty Yeah, if you look at all the overloaded definitions of at() linked in the docs, you will see there are at most 3 int arguments, for 3-D access; otherwise you need to index an N-D matrix with an array or a vector of indices. If this answers your question, would you mind accepting it? Otherwise let me know if it should be expanded. –


Someone may find this question in the context of using MobileNet SSD + the deep neural network (dnn) module in OpenCV for object detection, so here I post the finished object detection code. Alexander Reynolds, thank you for your help.

#include <opencv2/dnn.hpp> 
#include <opencv2/imgproc.hpp> 
#include <opencv2/highgui.hpp> 

#include <fstream> 
#include <iostream> 

using namespace cv; 
using namespace cv::dnn; 
using namespace std; 

// function to create vector of class names 
std::vector<String> createClaseNames() { 
    std::vector<String> classNames; 
    classNames.push_back("background"); 
    classNames.push_back("aeroplane"); 
    classNames.push_back("bicycle"); 
    classNames.push_back("bird"); 
    classNames.push_back("boat"); 
    classNames.push_back("bottle"); 
    classNames.push_back("bus"); 
    classNames.push_back("car"); 
    classNames.push_back("cat"); 
    classNames.push_back("chair"); 
    classNames.push_back("cow"); 
    classNames.push_back("diningtable"); 
    classNames.push_back("dog"); 
    classNames.push_back("horse"); 
    classNames.push_back("motorbike"); 
    classNames.push_back("person"); 
    classNames.push_back("pottedplant"); 
    classNames.push_back("sheep"); 
    classNames.push_back("sofa"); 
    classNames.push_back("train"); 
    classNames.push_back("tvmonitor"); 
    return classNames; 
} 

// main function 
int main(int argc, char **argv) 
{ 
    // set inputs 
    String modelTxt = "Path to MobileNetSSD_deploy.prototxt"; 
    String modelBin = "Path to MobileNetSSD_deploy.caffemodel"; 
    String imageFile = "Path to test image"; 
    std::vector<String> classNames = createClaseNames(); 

    //read caffe model 
    Net net; 
    try { 
     net = dnn::readNetFromCaffe(modelTxt, modelBin); 
    } 
    catch (cv::Exception& e) { 
     std::cerr << "Exception: " << e.what() << std::endl; 
     if (net.empty()) 
     { 
      std::cerr << "Can't load network." << std::endl; 
      exit(-1); 
     } 
    } 

    // read image 
    Mat img = imread(imageFile); 
    Size imgSize = img.size(); 

    // create input blob 
    Mat img300; 
    resize(img, img300, Size(300, 300)); 
    Mat inputBlob = blobFromImage(img300, 0.007843, Size(300, 300), Scalar(127.5)); //Convert Mat to dnn::Blob image batch 

    // apply the blob on the input layer 
    net.setInput(inputBlob); //set the network input 

    // classify the image by applying the blob on the net 
    Mat detections = net.forward("detection_out"); //compute output 

    // look what the detector found 
    for (int i=0; i < detections.size[2]; i++) { 

     // print information into console 
     cout << "-----------------" << endl; 
     cout << "Object nr. " << i + 1 << endl; 

     // detected class 
     int indxCls[4] = { 0, 0, i, 1 }; 
     int cls = detections.at<float>(indxCls); 
     std::cout << "class: " << classNames[cls] << endl; 

     // confidence 
     int indxCnf[4] = { 0, 0, i, 2 }; 
     float cnf = detections.at<float>(indxCnf); 
     std::cout << "confidence: " << cnf * 100 << "%" << endl; 

     // bounding box 
     int indxBx[4] = { 0, 0, i, 3 }; 
     int indxBy[4] = { 0, 0, i, 4 }; 
     int indxBw[4] = { 0, 0, i, 5 }; 
     int indxBh[4] = { 0, 0, i, 6 }; 
     int Bx = detections.at<float>(indxBx) * imgSize.width; 
     int By = detections.at<float>(indxBy) * imgSize.height; 
     int Bw = detections.at<float>(indxBw) * imgSize.width - Bx; 
     int Bh = detections.at<float>(indxBh) * imgSize.height - By; 
     std::cout << "bounding box [x, y, w, h]: " << Bx << ", " << By << ", " << Bw << ", " << Bh << endl; 

     // draw bounding box to image 
     Rect bbox(Bx, By, Bw, Bh); 
     rectangle(img, bbox, Scalar(255,0,255),1,8,0); 

    } 
    //show image 
    String winName("image"); 
    imshow(winName, img); 

    // Wait for keypress 
    waitKey(); 

} 
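
A side note (my addition, not part of the original answer): instead of building a 4-element index array for every value, the 1 x 1 x N x 7 output can also be viewed as an N x 7 two-dimensional Mat, so each detection is read with the ordinary row/column at() accessor. A minimal sketch of that pattern, assuming detections is the Mat returned by net.forward("detection_out"):

#include <opencv2/opencv.hpp>
#include <iostream>

// Hedged sketch: print the confidence of every detection by viewing the
// [1 x 1 x N x 7] "detection_out" blob as an N x 7 two-dimensional Mat.
void printConfidences(cv::Mat detections)
{
    CV_Assert(detections.dims == 4 && detections.type() == CV_32F);
    cv::Mat detectionMat(detections.size[2], detections.size[3], CV_32F,
                         detections.ptr<float>());
    for (int i = 0; i < detectionMat.rows; i++)
        std::cout << "confidence[" << i << "] = "
                  << detectionMat.at<float>(i, 2) << std::endl;
}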