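Using warpAffine to display stitched images together

I am trying to stitch 2 images together by using template matching to find 3 sets of points, which I pass to cv2.getAffineTransform() to get a warp matrix, which I then pass to cv2.warpAffine() to align my images.

However, when I join my images, most of the warped image is not shown. I have tried different techniques for selecting the points, changing the order of the arguments and so on, but I can only ever get a thin sliver of the warped image to show.

Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.

This is the final result that I get. Here are the original images (1, 2) and the code that I use:

EDIT: Here is the value of the variable trans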

array([[ 1.00768049e+00, -3.76690353e-17, -3.13824885e+00], 
     [ 4.84461775e-03, 1.30769231e+00, 9.61912797e+02]])

And here are the points passed to cv2.getAffineTransform:

unified_pair1

array([[ 671., 1024.], 
     [ 15., 979.], 
     [ 15., 962.]], dtype=float32)

unified_pair2

array([[ 669., 45.], 
     [ 18., 13.], 
     [ 18., 0.]], dtype=float32)


import cv2 
import numpy as np 


def showimage(image, name="No name given"): 
    cv2.imshow(name, image) 
    cv2.waitKey(0) 
    cv2.destroyAllWindows() 
    return 

image_a = cv2.imread('image_a.png') 
image_b = cv2.imread('image_b.png') 


def get_roi(image): 
    roi = cv2.selectROI(image) # spacebar to confirm selection 
    cv2.waitKey(0) 
    cv2.destroyAllWindows() 
    crop = image[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])] 
    return crop 
temp_1 = get_roi(image_a) 
temp_2 = get_roi(image_a) 
temp_3 = get_roi(image_a) 

def find_template(template, search_image_a, search_image_b): 
    ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED) 
    template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max()) 

    ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED) 
    template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max()) 
    return template_loc_a, template_loc_b 


coord_a1, coord_b1 = find_template(temp_1, image_a, image_b) 
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b) 
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b) 

def unnest_list(coords_list): 
    coords_list = [a[0] for a in coords_list] 
    return coords_list 

coord_a1 = unnest_list(coord_a1) 
coord_b1 = unnest_list(coord_b1) 
coord_a2 = unnest_list(coord_a2) 
coord_b2 = unnest_list(coord_b2) 
coord_a3 = unnest_list(coord_a3) 
coord_b3 = unnest_list(coord_b3) 

def unify_coords(coords1,coords2,coords3): 
    unified = [] 
    unified.extend([coords1, coords2, coords3]) 
    return unified 

# Create 2 lists, each containing 3 pairs of coordinates 
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3) 
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3) 

# Convert elements of lists to numpy arrays with data type float32 
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32) 
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32) 

# Get result of the affine transformation 
trans = cv2.getAffineTransform(unified_pair1, unified_pair2) 

# Apply the affine transformation to original image 
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0])) 
result[0:image_b.shape[0], image_b.shape[1]:] = image_b 

showimage(result) 
cv2.imwrite('result.png', result)

Sources: the approach is based on the advice received here, this tutorial and this example from the docs.

Source

2017-06-09 Bprodz

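Great job implementing the solution from the last question! One suggestion, so people don't have to sift through all the code: you may want to include the transformation you created, and show only the code that computes that transformation, so readers can check that you are getting the right coordinates. I don't know that it's necessary, though, since I'm fairly sure I can give an answer right away. The main problem is that the tutorial assumes you are going from left to right; it doesn't actually know how many pixels need to be shifted. However, you can calculate this.

@AlexanderReynolds thanks for your comment. I have added the variables used to generate the warp matrix to the original question, as well as the transformation matrix 'trans'. I was assuming (maybe incorrectly) that the transformation matrix generated by 'cv2.getAffineTransform' would also shift the image. Do you know how I would go about calculating this shift? – Bprodz

Yes, I do. Writing an answer now. :)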

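Edit, July 12:

This post inspired a GitHub repo providing functions to accomplish this task; one for a padded warpAffine() and another for a padded warpPerspective(). Fork the Python version or the C++ version.

Transformations shift the location of pixels

What any transformation does is take your point coordinates (x, y) and map them to new locations (x', y'):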

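\[
\begin{bmatrix} s\,x' \\ s\,y' \\ s \end{bmatrix}
=
\begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]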

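where s is some scaling factor (the third component of the product, h7*x + h8*y + 1). You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies, that is, (3, 3) transformation matrices. You don't need to scale for affine transformations (you don't even need to use homogeneous coordinates, but it's better to keep this discussion general).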
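To make the role of s concrete, here is a minimal sketch. The matrix values are made up for illustration and are not taken from the question. The bottom row of H is not [0, 0, 1], so the scale factor differs from 1 and the division matters. The affine matrix A shows the case where s stays at 1.

```python
import numpy as np

# Hypothetical homography, chosen only for illustration.
H = np.array([[1.0,   0.0, 5.0],
              [0.0,   1.0, 0.0],
              [0.001, 0.0, 1.0]])

pt = np.array([100.0, 50.0, 1.0])  # (x, y) = (100, 50) in homogeneous form
mapped = H.dot(pt)                 # [s*x', s*y', s]
s = mapped[2]
x_new, y_new = mapped[:2] / s      # divide by s to recover pixel coordinates

# For an affine matrix the bottom row is [0, 0, 1], so s is always 1.
A = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
s_affine = A.dot(pt)[2]

print(s, x_new, y_new, s_affine)
```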

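Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So at some point during this process, those new locations are computed. We need them to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.

Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and the second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully to (-10,0), if logic makes any sense. And in fact, it does: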

transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]]) 
homg_pt = np.array([0,0,1]) 
new_homg_pt = transf.dot(homg_pt) 
new_homg_pt /= new_homg_pt[2] 
# new_homg_pt = [-10. 0. 1.]

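Perfect! So we can figure out where all points map with a little linear algebra. We need to get all the (x,y) points and put them in one big array so that every point is in its own column. Let's pretend our image is only 4x4.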

h, w = src.shape[:2] # 4, 4 
indY, indX = np.indices((h,w)) # similar to meshgrid/mgrid 
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))

lin_homg_pts now holds every homogeneous point:

[[ 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3.] 
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.] 
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]

Then we can do a matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.

trans_lin_homg_pts = transf.dot(lin_homg_pts) 
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

And now we have the transformed points:

[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.] 
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.] 
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]

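As we can see, everything is working as expected: we have shifted only the x-values, by -10.

Pixels can be shifted outside of your image bounds

Notice that these pixel locations are negative, so they are outside the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we will get some pixel locations well outside our original bounds. We don't care about every pixel location though; we just need to know how far outside the original image the furthest pixels land, so that we can pad the original image that far out before displaying the warped image on top of it.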

theta = 45*np.pi/180 
transf = np.array([ 
    [ np.cos(theta),np.sin(theta),0], 
    [-np.sin(theta),np.cos(theta),0], 
    [0.,0.,1.]]) 
print(transf) 
trans_lin_homg_pts = transf.dot(lin_homg_pts) 
minX = np.min(trans_lin_homg_pts[0,:]) 
minY = np.min(trans_lin_homg_pts[1,:]) 
maxX = np.max(trans_lin_homg_pts[0,:]) 
maxY = np.max(trans_lin_homg_pts[1,:]) 
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,

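So we see that we can get pixel locations well outside our original image, in both the positive and negative directions. The minimum x value doesn't change because when a homography applies a rotation, it rotates about the top-left corner. One thing to note here is that I have applied the transformation to all pixels in the image. That is really unnecessary; you can simply warp the four corner points and see where they land.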
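As a quick check of that last claim, the sketch below compares the bounds obtained from every pixel location of the 4x4 toy image with the bounds obtained from its four corner locations only, for the same 45 degree rotation. The helper name bounds is just for this illustration. For an affine transform the extremes always fall on corners; for a general homography this assumes no image point is mapped to infinity.

```python
import numpy as np

h, w = 4, 4  # same toy image size as in the example above
theta = 45 * np.pi / 180
transf = np.array([[ np.cos(theta), np.sin(theta), 0.],
                   [-np.sin(theta), np.cos(theta), 0.],
                   [0., 0., 1.]])

# Every pixel location as a homogeneous column.
ind_y, ind_x = np.indices((h, w))
all_pts = np.stack((ind_x.ravel(), ind_y.ravel(), np.ones(ind_y.size)))

# Only the four corner pixel locations.
corner_pts = np.array([[0, w - 1, w - 1, 0],
                       [0, 0, h - 1, h - 1],
                       [1, 1, 1, 1]], dtype=float)

def bounds(pts):
    """Return (minX, minY, maxX, maxY) of the warped points."""
    warped = transf.dot(pts)
    warped /= warped[2, :]
    return warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max()

print(bounds(all_pts))
print(bounds(corner_pts))
```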

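Padding the destination image

Note that when you call cv2.warpAffine() you have to input the destination size. The transformed pixel locations are interpreted relative to that size. So if a pixel gets mapped to (-10,0), it won't show up in the destination image. That means we have to make another homography with a translation that shifts all of the pixel locations to be positive, and then pad the image matrix to compensate for our shift. We also have to pad the original image on the bottom and the right if the homography moves points to positions beyond the image.

In the most recent example, the min x value is unchanged, so we need no horizontal shift. However, the min y value has dropped by about two pixels, so we need to shift the image two pixels down. First, let's create the padded destination image.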

pad_sz = list(src.shape) # in case three channel 
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int) 
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int) 
dst_pad = np.zeros(pad_sz, dtype=np.uint8) 
# pad_sz = [6, 4, 3]

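As we can see, the height increased by two pixels from the original to account for that shift.

Add translation to the transformation to shift all pixel locations to positive

Now we need to create a new homography matrix that translates the warped image by the same amount we shifted by. To apply both transformations (the original and this new shift) we have to compose the two homographies. For an affine transformation you can simply add the translation, but not for a homography. Additionally, we need to divide by the last entry to make sure the scale is still correct (again, only for homographies):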

anchorX, anchorY = 0, 0 
transl_transf = np.eye(3,3) 
if minX < 0: 
    anchorX = np.round(-minX).astype(int) 
    transl_transf[0,2] += anchorX  # shift right so the smallest x lands at 0 
if minY < 0: 
    anchorY = np.round(-minY).astype(int) 
    transl_transf[1,2] += anchorY  # shift down so the smallest y lands at 0 
new_transf = transl_transf.dot(transf) 
new_transf /= new_transf[2,2]
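The remark that adding the translation works for affine transformations but not for homographies can be checked numerically. This is a small sketch with made-up matrices; none of the values come from the question. It compares "add to the last column" against "compose with a translation matrix" in both cases.

```python
import numpy as np

shift = np.array([[1., 0., 7.],
                  [0., 1., 3.],
                  [0., 0., 1.]])  # translate by (+7, +3)

# Affine case: bottom row is [0, 0, 1].
affine = np.array([[0.9, -0.2, -4.],
                   [0.2,  0.9, -2.],
                   [0.,   0.,   1.]])
affine_added = affine.copy()
affine_added[0, 2] += 7
affine_added[1, 2] += 3
affine_composed = shift.dot(affine)

# Homography case: non-zero entries in the bottom row.
homog = affine.copy()
homog[2, :2] = [0.001, 0.002]
homog_added = homog.copy()
homog_added[0, 2] += 7
homog_added[1, 2] += 3
homog_composed = shift.dot(homog)

print(np.allclose(affine_added, affine_composed))  # the two approaches agree
print(np.allclose(homog_added, homog_composed))    # they no longer agree
```

Composition is the safe choice in both cases, which is why the code above always multiplies by the translation matrix.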

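Here I also created the anchor point at which we will place the destination image inside the padded matrix; it is shifted by the same amount that the homography shifts the warped image. So let's place the destination image inside the padded matrix: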

dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
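The placement step can be tried on its own with a stand-in image. In this sketch dst is a synthetic 4x4 white image (an assumption for illustration), dst_sz is taken as its shape, and the anchor and padded size are the ones from the rotation example above.

```python
import numpy as np

dst = np.full((4, 4, 3), 255, dtype=np.uint8)  # stand-in 4x4 white image
dst_sz = dst.shape
anchorX, anchorY = 0, 2                        # shift from the rotation example
pad_sz = [6, 4, 3]                             # padded size from the rotation example

dst_pad = np.zeros(pad_sz, dtype=np.uint8)
dst_pad[anchorY:anchorY + dst_sz[0], anchorX:anchorX + dst_sz[1]] = dst

# The first two rows stay black; the image occupies the remaining four.
print(dst_pad[:, :, 0])
```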

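Warp with the new transformation into the padded image

All we have left to do is apply the new transformation to the source image (using the padded destination size), and then we can overlay the two images.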

warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0])) 

alpha = 0.3 
beta = 1 - alpha 
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)

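Putting it all together

Let's create a function for this, since we created quite a few variables that we don't need at the end. For inputs we need the source image, the destination image, and the original homography. For outputs we simply want the padded destination image and the warped image. Note that the examples used a 3x3 homography, so we had better make sure we send in 3x3 transformations rather than 2x3 affine or Euclidean warps. You can just add the row [0,0,1] to the bottom of any affine warp and you'll be fine.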

def warpPerspectivePadded(src, dst, transf): 

    src_h, src_w = src.shape[:2] 
    lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]]) 

    trans_lin_homg_pts = transf.dot(lin_homg_pts) 
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:] 

    minX = np.min(trans_lin_homg_pts[0,:]) 
    minY = np.min(trans_lin_homg_pts[1,:]) 
    maxX = np.max(trans_lin_homg_pts[0,:]) 
    maxY = np.max(trans_lin_homg_pts[1,:]) 

    # calculate the needed padding and create a blank image to place dst within 
    dst_sz = list(dst.shape) 
    pad_sz = dst_sz.copy() # to get the same number of channels 
    pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int) 
    pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int) 
    dst_pad = np.zeros(pad_sz, dtype=np.uint8) 

    # add translation to the transformation matrix to shift to positive values 
    anchorX, anchorY = 0, 0 
    transl_transf = np.eye(3,3) 
    if minX < 0: 
        anchorX = np.round(-minX).astype(int) 
        transl_transf[0,2] += anchorX 
    if minY < 0: 
        anchorY = np.round(-minY).astype(int) 
        transl_transf[1,2] += anchorY 
    new_transf = transl_transf.dot(transf) 
    new_transf /= new_transf[2,2] 

    dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst 

    warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0])) 

    return dst_pad, warped
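Since the function expects a 3x3 matrix, a 2x3 affine result such as the trans matrix printed in the question needs the extra row first. This is a minimal sketch of that step; the point used for the check is one of the coordinate pairs listed in the question.

```python
import numpy as np

# The 2x3 matrix `trans` as printed in the question.
trans = np.array([[1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
                  [4.84461775e-03,  1.30769231e+00,  9.61912797e+02]])

# Append [0, 0, 1] to get a 3x3 matrix usable wherever a homography is expected.
trans_3x3 = np.vstack([trans, [0., 0., 1.]])

# Both forms map a point identically.
pt = np.array([15., 979., 1.])
print(trans.dot(pt))
print(trans_3x3.dot(pt)[:2])
```

The resulting trans_3x3 could then be passed as the transf argument of warpPerspectivePadded.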

Running the function

Finally, we can call this function with some real images and homographies and see how it pans out. I'll borrow the example from LearnOpenCV:

src = cv2.imread('book2.jpg') 
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32) 
dst = cv2.imread('book1.jpg') 
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32) 

transf = cv2.getPerspectiveTransform(pts_src, pts_dst) 

dst_pad, warped = warpPerspectivePadded(src, dst, transf) 

alpha = 0.5 
beta = 1 - alpha 
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0) 
cv2.imshow("Blended Warped Image", blended) 
cv2.waitKey(0)

We end up with this padded warped image:

[Image: Padded and warped]

as opposed to the typical cut-off warp you would normally get.
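Because the question works with cv2.warpAffine() and a 2x3 matrix, the same idea can also be sketched without lifting to 3x3. The helper name padded_affine_params below is hypothetical; it is not an OpenCV function and not the API of the GitHub repo mentioned above. It only computes the shifted matrix, the canvas size and the anchor, under the same rounding choices as the function above.

```python
import numpy as np

def padded_affine_params(src_shape, dst_shape, affine):
    """Hypothetical helper: shift a 2x3 affine so that the warped source and
    the destination fit on one canvas.
    Returns (shifted_2x3, (width, height), (anchor_x, anchor_y))."""
    src_h, src_w = src_shape[:2]
    dst_h, dst_w = dst_shape[:2]
    corners = np.array([[0, src_w, src_w, 0],
                        [0, 0, src_h, src_h],
                        [1, 1, 1, 1]], dtype=float)
    # 2x4 result; no division by a scale factor is needed for affine maps.
    warped = np.asarray(affine, dtype=float).dot(corners)
    min_x, min_y = warped.min(axis=1)
    max_x, max_y = warped.max(axis=1)

    anchor_x = int(np.round(-min_x)) if min_x < 0 else 0
    anchor_y = int(np.round(-min_y)) if min_y < 0 else 0

    # For an affine map, adding the shift to the last column is enough.
    shifted = np.array(affine, dtype=float)
    shifted[0, 2] += anchor_x
    shifted[1, 2] += anchor_y

    width = int(np.round(max(dst_w, max_x) - min(0, min_x)))
    height = int(np.round(max(dst_h, max_y) - min(0, min_y)))
    return shifted, (width, height), (anchor_x, anchor_y)

# Toy check: shift a 4x4 image ten pixels to the left.
shift_left = np.array([[1., 0., -10.],
                       [0., 1.,   0.]])
shifted, size, anchor = padded_affine_params((4, 4), (4, 4), shift_left)
print(shifted, size, anchor)

# With real images one could then call, for example:
#   warped = cv2.warpAffine(src, shifted, size)  # size is (width, height)
#   dst_pad[anchor[1]:anchor[1] + dst.shape[0],
#           anchor[0]:anchor[0] + dst.shape[1]] = dst
```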

Source

2017-06-09 14:04:32

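Wow, thank you for such a comprehensive and well-organized answer. I am working through it and stopping to read up on certain concepts. I will post my final solution when I am done. Thanks again! – Bprodz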
