2016-02-09 46 views
2

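Segmenting cursive characters (Arabic OCR)

I want to segment an Arabic word into individual characters. Based on the histogram/profile, I assume I can do the segmentation by cutting the characters along the baseline, where the column pixel counts are similar. Unfortunately, I am still stuck writing code that makes this work.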

% Original Code by Soumyadeep Sinha 
% Saving each single segmented character as one file 
function [segm] = trysegment (a) 
myFolder = 'D:\1. Thesis FINISH!!!\Data set\trial'; 
level = graythresh (a); 
bw = im2bw (a, level); 
b = imcomplement (bw); 
i= padarray(b,[0 10]); 
verticalProjection = sum(i, 1); 
set(gcf, 'Name', 'Trying Segmentation for Cursive', 'NumberTitle', 'Off') 
subplot(2, 2, 1);imshow(i); 
subplot(2,2,3); 
plot(verticalProjection, 'b-'); %histogram show by this code 
% hist(reshape(input,[],3),1:max(input(:))); 
grid on; 
% % t = verticalProjection; 
% % t(t==0) = inf; 
% % mayukh = min(t) 
% 0 where there is background, 1 where there are letters 
letterLocations = verticalProjection > 0; 
% Find Rising and falling edges 
d = diff(letterLocations); 
startingColumns = find(d>0); 
endingColumns = find(d<0); 
% Extract each region 
y=1; 
for k = 1 : length(startingColumns)
    % Get sub image of just one character...
    subImage = i(:, startingColumns(k):endingColumns(k));
    % se = strel('rectangle',[2 4]);
    % dil = imdilate(subImage, se);
    th = bwmorph(subImage,'thin',Inf);
    n = imresize (th, [64 NaN], 'bilinear');
    figure, imshow (n);
    [L,num] = bwlabeln(n);
    for z = 1 : num
        bw = ismember(L, z);
        % Construct filename for this particular image.
        baseFileName = sprintf('char %d.png', y);
        y = y+1;
        % Prepend the folder to make the full file name.
        fullFileName = fullfile(myFolder, baseFileName);
        % Do the write to disk.
        imwrite(bw, fullFileName);
        % subplot(2,2,4);
        % pause(2);
        % imshow(bw);
    end
    % y=y+1;
end;
segm = (n);

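The word image is shown below: <code>Segmenting cursive character</code>

Why does the code not work properly? Can you recommend other code, or suggest an algorithm that does a good job of segmenting cursive characters?

Thanks in advance.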


What are the dimensions of the matrix 'verticalProjection'?
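For reference, sum(i, 1) adds up each column, so an image with R rows and C columns gives a 1-by-C row vector. A minimal check, assuming a made-up 3-by-5 binary image (the matrix `img` is illustrative data, not the poster's word image):

```matlab
% Toy binary image (hypothetical data): 3 rows, 5 columns.
img = logical([0 1 1 0 1;
               0 1 0 0 1;
               0 1 1 0 0]);

% Summing along dimension 1 collapses the rows: one total per column.
verticalProjection = sum(img, 1);

disp(size(verticalProjection));   % 1 5
disp(verticalProjection);         % 0 3 2 0 2
```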

Answers

1

In the posted code, find this part:

% 0 where there is background, 1 where there are letters 
letterLocations = verticalProjection > 0; 
% Find Rising and falling edges 
d = diff(letterLocations); 
startingColumns = find(d>0); 
endingColumns = find(d<0); 

and replace it with this new section:

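% Columns whose ink count exceeds one third of the peak are treated as
% character bodies; the rest are gaps or thin connecting strokes.
threshold = max(verticalProjection) / 3;
thresholdedProjection = verticalProjection > threshold;
numCols = length(thresholdedProjection);
count = 0;                 % length of the current run of low columns
startingColumns = [];
% The loop variable is 'col', not 'i': the posted function keeps the
% padded image in 'i', and reusing that name would overwrite it.
for col = 1:numCols
    if thresholdedProjection(col)
        if count > 0
            % Cut in the middle of the low run that just ended.
            startingColumns(end+1) = col - floor(count/2);
            count = 0;
        end
    else
        count = count + 1;
    end
end
% Each segment ends just before the next one starts; the last segment
% ends in the middle of the trailing low run.
endingColumns = [startingColumns(2:end)-1, numCols - floor(count/2)];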

The rest of the code needs no changes.
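To see what the replacement does, here is a small walk-through on invented column sums. The numbers and the names `isBody` and `lowRun` are assumptions for illustration only; the logic mirrors the cut-at-the-midpoint idea above. Two letters are joined by a thin stroke, followed by a real gap and a third letter.

```matlab
% Toy column sums (hypothetical): two letters joined by a thin stroke
% (columns 5-7), then a true gap (columns 10-12), then a third letter.
verticalProjection = [0 0 5 6 1 1 1 6 5 0 0 0 7 8 0 0];

threshold = max(verticalProjection) / 3;   % 8/3, about 2.67
isBody = verticalProjection > threshold;   % thick columns only

numCols = numel(isBody);
startingColumns = [];
lowRun = 0;                                % current run of thin or empty columns
for col = 1:numCols
    if isBody(col)
        if lowRun > 0
            % A low run just ended: cut at its midpoint.
            startingColumns(end+1) = col - floor(lowRun / 2);
            lowRun = 0;
        end
    else
        lowRun = lowRun + 1;
    end
end
endingColumns = [startingColumns(2:end) - 1, numCols - floor(lowRun / 2)];

disp([startingColumns; endingColumns]);
% Segments: 2..6, 7..11, 12..15

% For comparison, the original "> 0" test only reacts to the true gap.
d = diff(double(verticalProjection > 0));
disp(find(d > 0));                         % rising edges at 2 and 12
```

The thin stroke in columns 5-7 has nonzero ink, so the "> 0" test never sees a boundary there, while the thresholded version places a cut in the middle of it.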


Sorry for the late reply, @Rijul. I need help again. I am getting an error in segm at this line: subImage = i(:, startingColumns(k):endingColumns(k)); What should I do?


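Thank you very much, @RijulSudhir. After a few modifications, it works. Could you explain properly how the code works? I think I still need to modify it further to get the result I expect.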


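Could you explain why threshold = max(verticalProjection)/3? I have a lot of image data, and the images need different thresholds, so the code does not succeed on all of them. Could you tell me more about how the code works, so that I can work out what to change to solve this? Thanks.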
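The divisor 3 looks like a heuristic: it ties the cut-off to the tallest column, so one tall letter can push short letters below the threshold. One possible alternative, sketched here as an assumption and not taken from the answer, is to estimate the connecting-stroke thickness from the image itself as the most frequent nonzero column sum. The data, the `margin` value, and the variable names are all hypothetical.

```matlab
% Hypothetical column sums: a tall letter (30), a thin connecting
% stroke (2), a short letter (8-9), another stroke, another tall letter.
verticalProjection = [0 0 30 30 2 2 2 8 9 2 2 30 0 0];

% Fixed ratio from the answer: the tall letter pushes the threshold to 10,
% so the short letter at columns 8-9 falls below it.
fixedBody = verticalProjection > max(verticalProjection) / 3;

% Data-driven alternative (an assumption, not from the answer): take the
% most frequent nonzero column sum as the connecting-stroke thickness.
inkColumns = verticalProjection(verticalProjection > 0);
strokeThickness = mode(inkColumns);
margin = 1;                          % arbitrary tolerance, tune per data set
adaptiveBody = verticalProjection > strokeThickness + margin;

disp(fixedBody);      % 0 0 1 1 0 0 0 0 0 0 0 1 0 0
disp(adaptiveBody);   % 0 0 1 1 0 0 0 1 1 0 0 1 0 0
```

If this fits the data, adaptiveBody could take the place of thresholdedProjection in the loop. It relies on connecting strokes being the most common column height, which may not hold for very short words or noisy scans.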