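I'm trying to vectorize a sliding-window search for object detection. So far I've been able to use NumPy broadcasting to slice my main image into window-sized slices, which are stored in the variable 'all_windows' shown below. I've verified that the actual values match, so I'm happy with that part. How do I call a function on the slices of a vectorized sliding window?
The next part is where I'm having trouble. I'd like to index into the 'all_windows' array as I call the patchCleanNPredict() function, so that I can pass each window into the function in a similarly vectorized form.
I tried to create an array called new_indx that would hold the slice indices as a 2D array, e.g. ([0,0],[1,0],[2,0] ...), but I've run into problems.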
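For reference, here is a minimal sketch of one way such an index array could be built and used. It runs on a small stand-in array rather than the real 'all_windows', and the use of np.meshgrid is an assumption about what would fit, not something taken from the code in this post.

```python
import numpy as np

# Small stand-in for the 4-D all_windows array (grid axis 0, grid axis 1, rows, cols).
all_windows = np.arange(3 * 4 * 2 * 2).reshape(3, 4, 2, 2)

# Every (i, j) pair of the window grid, with the first index varying fastest:
# [0, 0], [1, 0], [2, 0], [0, 1], ...
n_i, n_j = all_windows.shape[:2]
jj, ii = np.meshgrid(np.arange(n_j), np.arange(n_i), indexing='ij')
new_indx = np.stack([ii.ravel(), jj.ravel()], axis=1)

# Passing the two columns as separate index arrays selects one window per pair.
selected = all_windows[new_indx[:, 0], new_indx[:, 1]]

print(new_indx[:4].tolist())  # [[0, 0], [1, 0], [2, 0], [0, 1]]
print(selected.shape)         # (12, 2, 2)
```

Note that the two index columns are passed separately; passing the (N, 2) array as a single index would instead be applied to the first axis only.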
I'm hoping to end up with an array of confidence values, one per window. The code below works in Python 3.5. Thanks in advance for any help or advice.
import numpy as np
def patchCleanNPredict(patch):
    # patch = cv2.resize()  # shrink patches with OpenCV's resize function
    patch = patch.reshape(1, -1)  # flatten the patch into a single row vector
    print('patch: ', patch.shape)
    # confidence = predict(patch)  # fake function showing prediction intent
    return  # confidence
window = (30, 46)  # window dimensions (width, height)
strideY = 10
strideX = 10
img = np.random.randint(0, 245, (640, 480))  # image that is being sliced by the windows
# row indices for each vertical window: one row of window[1] consecutive indices per start
indx = np.arange(0, img.shape[0] - window[1], strideY)[:, None] + np.arange(window[1])
vertical_windows = img[indx]
print(vertical_windows.shape)  # prints (60, 46, 480)
vertical_windows = np.transpose(vertical_windows, (0, 2, 1))
# column indices for each horizontal window, applied to every vertical strip
indx = np.arange(0, vertical_windows.shape[1] - window[0], strideX)[:, None] + np.arange(window[0])
all_windows = vertical_windows[:, indx]
all_windows = np.transpose(all_windows, (1, 0, 3, 2))
print(all_windows.shape)  # prints (45, 60, 46, 30)
data_patch_size = (window[0] // 2, window[1] // 2)  # size the windows will be shrunk to
single_patch = all_windows[0, 0, :, :]
patchCleanNPredict(single_patch)  # prints the flattened patch size (1, 1380)
new_indx = (1, 1)  # should this be an array of indices?
patchCleanNPredict(all_windows[new_indx, :, :])  # this is where I'm having trouble
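
One possible direction, sketched under assumptions: NumPy treats a tuple mixed with slices as a list of positions on the first axis, so all_windows[(1, 1), :, :] picks grid row 1 twice instead of the single window at (1, 1). The sketch below shows that difference on a random stand-in array, then flattens every window into one row of a 2D batch. predict_batch is a hypothetical placeholder for the real classifier, and the sketch assumes the classifier can accept a whole batch at once. The per-window resize step is left out.

```python
import numpy as np

# Random stand-in with the same shape as all_windows in the code above.
all_windows = np.random.randint(0, 245, (45, 60, 46, 30))

# A tuple used as the whole index addresses a single window ...
one_window = all_windows[(1, 1)]       # same as all_windows[1, 1]
# ... but a tuple mixed with slices is read as a list of positions on axis 0.
two_rows = all_windows[(1, 1), :, :]   # same as all_windows[[1, 1], :, :]

# Collapse the two grid axes and flatten each window into one row, so that a
# predictor accepting a 2-D batch can score every window in a single call.
batch = all_windows.reshape(-1, all_windows.shape[2] * all_windows.shape[3])

def predict_batch(rows):
    # Hypothetical stand-in for the real classifier: one score per row.
    return rows.mean(axis=1)

# Fold the scores back onto the window grid: one confidence per window.
confidences = predict_batch(batch).reshape(all_windows.shape[:2])

print(one_window.shape)   # (46, 30)
print(two_rows.shape)     # (2, 60, 46, 30)
print(batch.shape)        # (2700, 1380)
print(confidences.shape)  # (45, 60)
```

With this layout, confidences[i, j] corresponds to all_windows[i, j], so no separate index array is needed to map scores back to windows.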