Invoice/Imaging: Removing noise from an image

I have a black-and-white image here that I am preparing to feed into an OCR engine, namely Tesseract. However, Tesseract cannot detect anything in the noisy areas.

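What kind of solution should I be looking for to remove the noise? Since Tesseract cannot recognize text in those areas anyway, I think removing the noise is the best option.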


2013-09-23 skiwi

You can use TextCleaner, an ImageMagick script, to clean up the background of the text.


2013-09-23 23:30:39 nguyenq

I will try this at work next week when I am back, though I hope the licensing fee is not too bad. – skiwi

In case you are looking for Python code, here is a snippet that removes the noise:

import cv2 
import numpy as np 

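# load color image (cv2.imread returns None if the file cannot be read)
im = cv2.imread('input.jpg')
if im is None:
    raise SystemExit('could not read input.jpg')

# smooth the image with alternating closing and opening
# with an enlarging kernel
morph = im.copy()

# note: a 1x1 kernel leaves the image unchanged; enlarge it for stronger smoothing
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 1))
morph = cv2.morphologyEx(morph, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)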

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2)) 

# take morphological gradient 
gradient_image = cv2.morphologyEx(morph, cv2.MORPH_GRADIENT, kernel) 

# split the gradient image into channels 
image_channels = np.split(np.asarray(gradient_image), 3, axis=2) 

channel_height, channel_width, _ = image_channels[0].shape 

# apply Otsu threshold to each channel 
for i in range(0, 3): 
    _, image_channels[i] = cv2.threshold(~image_channels[i], 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY) 
    image_channels[i] = np.reshape(image_channels[i], newshape=(channel_height, channel_width, 1)) 

# merge the channels 
image_channels = np.concatenate((image_channels[0], image_channels[1], image_channels[2]), axis=2) 

# save the denoised image 
cv2.imwrite('output.jpg', image_channels)

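The code above will not give good results if the image you are processing is an invoice (or has a lot of text on a white background). To get good results on such images, remove the line

gradient_image = cv2.morphologyEx(morph, cv2.MORPH_GRADIENT, kernel)

pass morph to the split call instead of gradient_image, and remove the ~ operator inside the loop.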


2018-01-19 12:39:54
