Caffe pixel-wise classification/regression

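I want to do a simple pixel-wise classification or regression task, so I have an input image and a ground_truth image. Specifically, I want a simple segmentation task with a circle and a rectangle, and I want to train the network to find where the circle or the rectangle is. My ground_truth image therefore has the value "1" at every position covered by the circle and the value "2" at every position covered by the rectangle. Both the input image and the ground_truth image are fed in as .png images.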
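To make the data layout concrete, here is a minimal sketch of how such an image/ground_truth pair could be generated with NumPy. The function name `make_sample`, the image size, and the shape positions are all assumptions chosen for illustration; they are not taken from the setup described above.

```python
import numpy as np

def make_sample(height=64, width=64):
    """Return (image, label) containing one circle and one rectangle.

    label uses 0 = background, 1 = circle, 2 = rectangle.
    Sizes and positions are arbitrary illustrative choices.
    """
    yy, xx = np.mgrid[0:height, 0:width]
    label = np.zeros((height, width), dtype=np.uint8)

    # Circle centered at (16, 16) with radius 10 -> class 1
    circle = (yy - 16) ** 2 + (xx - 16) ** 2 <= 10 ** 2
    label[circle] = 1

    # Rectangle covering rows 36..55 and columns 30..59 -> class 2
    label[36:56, 30:60] = 2

    # Input image: white shapes on a black background
    image = np.where(label > 0, 255, 0).astype(np.uint8)
    return image, label

image, label = make_sample()
```

If the arrays are written to disk, a lossless format such as PNG keeps the label values 0, 1 and 2 exactly, whereas a lossy format such as JPEG could alter them.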

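I then assumed that I could switch between a regression and a classification task through my loss layer. I have been using the fully convolutional AlexNet from fcn alexnet.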

Classification:

layer { 
    name: "upscore" 
    type: "Deconvolution" 
    bottom: "score_fr" 
    top: "upscore" 
    param { 
    lr_mult: 0 
    } 
    convolution_param { 
    num_output: 3 ## <<---- 0 = background 1 = circle 2 = rectangle 
    bias_term: false 
    kernel_size: 63 
    stride: 32 
    } 
} 
layer { 
    name: "score" 
    type: "Crop" 
    bottom: "upscore" 
    bottom: "data" 
    top: "score" 
    crop_param { 
    axis: 2 
    offset: 18 
    } 
} 
layer { 
    name: "loss" 
    type: "SoftmaxWithLoss" ## <<---- 
    bottom: "score" 
    bottom: "ground_truth" 
    top: "loss" 
    loss_param { 
    ignore_label: 0 
    } 
}
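To illustrate what the loss layer above computes per pixel, here is a rough NumPy sketch of a softmax cross-entropy over a C x H x W score map with an optional ignore label. It assumes the loss is averaged over the pixels that are counted; the function name `softmax_loss` is hypothetical and this is not Caffe's own code. One thing the sketch makes visible: with `ignore_label: 0`, pixels labeled 0 (background here) contribute nothing to the loss, so wrong predictions on the background are never penalized.

```python
import numpy as np

def softmax_loss(scores, labels, ignore_label=None):
    """Mean per-pixel cross-entropy.

    scores: float array of shape (C, H, W), raw class scores.
    labels: int array of shape (H, W) with values in [0, C).
    Pixels whose label equals ignore_label are skipped.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Numerically stable log-softmax along the channel axis
    shifted = scores - scores.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Pick the log-probability of the true class at every pixel
    rows, cols = np.indices(labels.shape)
    picked = log_probs[labels, rows, cols]

    if ignore_label is None:
        mask = np.ones(labels.shape, dtype=bool)
    else:
        mask = labels != ignore_label

    count = mask.sum()
    if count == 0:
        return 0.0
    return float(-picked[mask].sum() / count)

# Toy example: the net predicts "circle" (class 1) everywhere.
scores = np.zeros((3, 2, 2))
scores[1] = 10.0
labels = np.array([[0, 1], [0, 0]])

loss_all = softmax_loss(scores, labels)                      # background counted
loss_ignored = softmax_loss(scores, labels, ignore_label=0)  # background skipped
```

In the toy example the loss is large when background pixels are counted and close to zero when they are ignored, even though three of the four pixels are predicted wrongly.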

Regression:

layer { 
    name: "upscore" 
    type: "Deconvolution" 
    bottom: "score_fr" 
    top: "upscore" 
    param { 
    lr_mult: 0 
    } 
    convolution_param { 
    num_output: 1 ## <<---- 1 x height x width 
    bias_term: false 
    kernel_size: 63 
    stride: 32 
    } 
} 
layer { 
    name: "score" 
    type: "Crop" 
    bottom: "upscore" 
    bottom: "data" 
    top: "score" 
    crop_param { 
    axis: 2 
    offset: 18 
    } 
} 
layer { 
    name: "loss" 
    type: "EuclideanLoss" ## <<---- 
    bottom: "score" 
    bottom: "ground_truth" 
    top: "loss" 
}
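For comparison, the regression variant can be sketched in NumPy as well. This assumes the Euclidean loss is the sum of squared differences divided by twice the batch size, and that both inputs have identical shapes; `euclidean_loss` is a hypothetical helper name, not a Caffe API.

```python
import numpy as np

def euclidean_loss(pred, target):
    """Sum of squared differences divided by 2N, where N is the batch size.

    pred, target: arrays of identical shape (N, 1, H, W).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    n = pred.shape[0]
    return float(((pred - target) ** 2).sum() / (2.0 * n))

# Toy example: an all-zero prediction against labels 0, 1 and 2.
pred = np.zeros((1, 1, 2, 2))
target = np.array([[[[0, 1], [2, 0]]]])
loss = euclidean_loss(pred, target)  # (1 + 4) / 2 = 2.5
```

Note that under this loss the label values are treated as quantities: predicting 0 where the target is 2 (rectangle) costs four times as much as predicting 0 where the target is 1 (circle).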

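However, neither setup produces anything close to the result I want. I think there is something wrong with my understanding of pixel-wise classification/regression. Can you tell me where my mistake is?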

EDIT 1

For regression, I retrieve the output like this:

output_blob = pred['result'].data 

predicated_image_array = np.array(output_blob) 
predicated_image_array = predicated_image_array.squeeze() 
print predicated_image_array.shape 
#print predicated_image_array.shape 

#print mean_array 
range_value = np.ptp(predicated_image_array) 
min_value = predicated_image_array.min() 
max_value = predicated_image_array.max() 

# make positive 
predicated_image_array[:] -= min_value 

if not range_value == 0: 
    predicated_image_array /= range_value 

predicated_image_array *= 255 
predicated_image_array = predicated_image_array.astype(np.int64) 
print predicated_image_array.shape 

cv2.imwrite('predicted_output.jpg', predicated_image_array)

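This is easy because the output is 1 x height x width and the values are the actual output values. But how do I retrieve the output for classification / the Softmax layer, given that the output is 3 (number of labels) x height x width? I do not know what the contents of that shape mean.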
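One common way to read a 3 x height x width classification output is to treat channel c as the score for class c at each pixel and take the argmax over the channel axis. The sketch below assumes exactly that layout; the helper names `scores_to_label_map` and `label_map_to_gray` are hypothetical, and the random array stands in for whatever score blob the net returns.

```python
import numpy as np

def scores_to_label_map(scores):
    """Convert class scores of shape (C, H, W) or (1, C, H, W) to an (H, W) label map."""
    scores = np.asarray(scores)
    if scores.ndim == 4:
        scores = scores[0]  # drop the batch dimension, keep the first item
    # The winning channel index at each pixel is the predicted class
    return scores.argmax(axis=0).astype(np.uint8)

def label_map_to_gray(label_map, num_classes):
    """Spread class indices over 0..255 so the map is visible as a grayscale image."""
    scale = 255 // max(num_classes - 1, 1)
    return (label_map.astype(np.uint16) * scale).astype(np.uint8)

# Stand-in for the network output: batch of 1, 3 classes, 4 x 5 pixels
scores = np.random.rand(1, 3, 4, 5)
label_map = scores_to_label_map(scores)   # values in {0, 1, 2}
gray = label_map_to_gray(label_map, 3)    # values in {0, 127, 254}
```

Because argmax only compares the channels, it gives the same label map whether the blob holds raw scores or softmax probabilities. The grayscale array can then be saved the same way as the regression output.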

Source

2016-11-11 thigi

First of all, your problem is not regression, it is classification!

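If you want to teach the net to recognize circles and rectangles, you have to build a dataset of images and labels, for example circle = 0 and rectangle = 1. You can do that with a text file that lists each image path followed by its label, for example:

/path/circle1.png 0
/path/circle2.png 0
/path/rectangle1.png 1
/path/rectangle1.png 1

For a problem like yours, there is a good tutorial here. Good luck.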

Source

2016-11-11 15:32:09

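No, no! I want to do a pixel-wise classification or regression task. Just a simple one. I know you can do this as an ordinary classification task, but I want pixel-wise classification. My idea is to do segmentation/classification. Do you see what I mean? – thigi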

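You can also think of it this way: I have a circle and a rectangle in one image, and I want to segment them! I just want a very simple pixel-based prediction task, because I already know how to do standard classification, and I want to be able to do pixel-wise prediction and take on harder tasks later! – thigi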

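So you can do it with 'segmentation'. For each image (circle/rectangle) you have to make a GT image (0-255). The number of classes in each image is 2 (background and geometric shape), so 'num_output: 2'. Set the background to 0 and the shape to 1. Good luck. –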
