2011-09-23 114 views

Answers

4

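The entropy solution seems problematic and overly computationally intensive. Why not detect the edges instead?

I just wrote this Python code to solve the same problem myself. My background is a dirty white, so the criteria I used were darkness and color. I simplified this by taking the minimum of each pixel's R, G, or B value, so that black and saturated red both register the same way. I also used the average of the N darkest pixels in each row or column. Then I started at each edge and worked my way inward until I crossed the threshold.

Here is my code: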

#these values set how sensitive the bounding box detection is 
threshold = 200  #the average of the darkest values must be _below_ this to count (0 is darkest, 255 is lightest) 
obviousness = 50 #how many of the darkest pixels to include (1 would mean a single dark pixel triggers it) 

import numpy as np
from PIL import Image

def find_line(vals):
    #implement edge detection once, use many times
    for i, tmp in enumerate(vals):
        tmp.sort()
        average = float(sum(tmp[:obviousness])) / len(tmp[:obviousness])
        if average <= threshold:
            return i
    return i #i is left over from failed threshold finding, it is the bounds

def getbox(img): 
    #get the bounding box of the interesting part of a PIL image object 
    #this is done by getting the darkest of the R, G or B values of each pixel
    #and finding where the edge gets dark/colored enough
    #returns a tuple of (left,upper,right,lower) 

    width, height = img.size #for making a 2d array 
    retval = [0,0,width,height] #values will be disposed of, but this is a black image's box 

    pixels = list(img.getdata()) 
    vals = []     #store the value of the darkest color 
    for pixel in pixels:
        vals.append(min(pixel)) #the darkest of the R,G or B values

    #make 2d array
    vals = np.array([vals[i * width:(i + 1) * width] for i in range(height)])

    #start with upper bounds 
    forupper = vals.copy() 
    retval[1] = find_line(forupper) 

    #next, do lower bounds 
    forlower = vals.copy() 
    forlower = np.flipud(forlower) 
    retval[3] = height - find_line(forlower) 

    #left edge, same as before but rotate the data so left edge is top edge
    forleft = vals.copy() 
    forleft = np.swapaxes(forleft,0,1) 
    retval[0] = find_line(forleft) 

    #and right edge is bottom edge of rotated array 
    forright = vals.copy() 
    forright = np.swapaxes(forright,0,1) 
    forright = np.flipud(forright) 
    retval[2] = width - find_line(forright) 

    if retval[0] >= retval[2] or retval[1] >= retval[3]:
        print("error, bounding box is not legit")
        return None
    return tuple(retval)

if __name__ == '__main__': 
    image = Image.open('cat.jpg') 
    box = getbox(image) 
    print("result is:", box)
    result = image.crop(box) 
    result.show() 
+0

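To my chagrin, this answer only works for small images. list(img.getdata()) crashes my whole computer on the larger images I work with (mine are 4 MB, but I have read of others reporting similar results with images of only 1 MB). – Permafacture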

+0

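The "right" answer uses `pixels = numpy.asarray(img)` instead of getdata(), and the resulting numpy array then has to be processed with itertools.imap. I got stuck at that point. I posted the solution I settled on at http://stackoverflow.com/questions/6136588/image-cropping-using-python/8696558 – Permafacture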
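For reference, here is a minimal sketch of how the same "darkest channel" test might be done directly on the array from `numpy.asarray(img)`, without `getdata()` or `itertools.imap`. This is not the solution posted at the link above; the helper names `darkest_channel`, `first_dark_line`, and `bounding_box` are made up for illustration, and the `threshold` and `obviousness` parameters just mirror the globals in the answer.

```python
import numpy as np

def darkest_channel(img):
    """Return a 2-D array with min(R, G, B) per pixel.

    `img` may be a PIL image or any array-like; np.asarray handles both.
    """
    arr = np.asarray(img)
    if arr.ndim == 3:
        # Color image: take the darkest of the first three channels.
        return arr[..., :3].min(axis=2)
    return arr  # already single-channel

def first_dark_line(vals, threshold=200, obviousness=50):
    """Index of the first row whose darkest pixels average <= threshold."""
    k = min(obviousness, vals.shape[1])
    # After partitioning, the k smallest values of each row sit in columns [0, k).
    darkest = np.partition(vals, k - 1, axis=1)[:, :k]
    hits = np.flatnonzero(darkest.mean(axis=1) <= threshold)
    # Like the original loop, fall back to the last index if nothing is found.
    return int(hits[0]) if hits.size else vals.shape[0] - 1

def bounding_box(img, threshold=200, obviousness=50):
    """Return (left, upper, right, lower) of the dark/colored region."""
    vals = darkest_channel(img)
    height, width = vals.shape
    upper = first_dark_line(vals, threshold, obviousness)
    lower = height - first_dark_line(vals[::-1], threshold, obviousness)
    left = first_dark_line(vals.T, threshold, obviousness)
    right = width - first_dark_line(vals.T[::-1], threshold, obviousness)
    return (left, upper, right, lower)
```

With Pillow installed, something like `image.crop(bounding_box(image))` should then work the same way as `getbox` in the answer, assuming the image converts cleanly to an array.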

2

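For starters: here is a similar question, here is a related question, and here is another related question.

This is just one idea; there are certainly other approaches. I would pick an arbitrary crop edge, measure the entropy* on either side of the line, and then re-select the crop line (perhaps using something like the bisection method) until the entropy of the cropped-out portion falls below a defined threshold. Thinking about it further, you may have to resort to a brute-force root-finding approach, since you won't get a good indication of when you have cropped too little. Then repeat for the remaining 3 edges.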
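A rough sketch of that idea for the top edge, assuming an 8-bit grayscale numpy array. The function names and the `max_entropy` default are hypothetical, and the bisection assumes the cropped-out strip stays "blank enough" right up to the true edge, which, as noted above, may not hold for real images.

```python
import numpy as np

def shannon_entropy(region):
    """Shannon entropy in bits of the 8-bit histogram of `region`."""
    counts = np.bincount(region.ravel(), minlength=256)
    probs = counts[counts > 0] / region.size
    return float(-(probs * np.log2(probs)).sum())

def top_crop_line(gray, max_entropy=0.5):
    """Largest k such that rows [0, k) have entropy <= max_entropy."""
    lo, hi = 0, gray.shape[0]  # invariant: rows [0, lo) are blank enough
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so mid >= 1 and the loop progresses
        if shannon_entropy(gray[:mid]) <= max_entropy:
            lo = mid       # strip is still blank: move the crop line inward
        else:
            hi = mid - 1   # strip contains content: back off
    return lo
```

The other three edges can reuse the same function on flipped or transposed views, for example `top_crop_line(gray[::-1])` for the bottom edge and `top_crop_line(gray.T)` for the left edge.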

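*I recall finding that the entropy method on the referenced site was not entirely accurate, but I could not find my notes (I'm sure it was in an SO post, though).

Edit: Other criteria for the "emptiness" of an image region (besides entropy) might be the contrast ratio, or the contrast ratio of an edge-detection result.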
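As one possible reading of the contrast criterion, the sketch below scores each row by its Michelson contrast, (max - min) / (max + min), and treats low-contrast rows as blank. The choice of Michelson contrast, the simple horizontal-difference "edge detector", the function names, and the `max_contrast` default are all assumptions made for illustration, not something specified in the answer.

```python
import numpy as np

def row_contrast(gray):
    """Michelson contrast of each row, in the range [0, 1]."""
    g = np.asarray(gray, dtype=np.float64)
    hi, lo = g.max(axis=1), g.min(axis=1)
    denom = hi + lo
    # Rows that are entirely zero get contrast 0 instead of dividing by zero.
    return np.divide(hi - lo, denom, out=np.zeros_like(denom), where=denom > 0)

def horizontal_edges(gray):
    """Very crude edge detection: absolute difference of neighboring pixels."""
    g = np.asarray(gray, dtype=np.float64)
    return np.abs(np.diff(g, axis=1))

def blank_rows(gray, max_contrast=0.1):
    """Boolean mask of rows whose contrast is at or below max_contrast."""
    return row_contrast(gray) <= max_contrast
```

Calling `blank_rows(gray)` applies the test to the raw pixels, while `blank_rows(horizontal_edges(gray))` applies it to the edge-detection result. Columns can be handled by passing the transposed array.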