2012-06-05 44 views
2

正如你可以看到,我做了一個程序,掃描一個文件,並有選擇地獲取頁面信息和材料&大小的信息和日期信息。如何在VB6中進行區域OCR?

enter image description here

當我使用OCR掃描這樣的:

Dim Mdoc As MODI.Document 
Dim Mlay As MODI.Layout 
Dim fso As Scripting.FileSystemObject 
Dim logfile As Object 

Public Function ScanMan(ByVal Name As String, ByVal Path As String) As String 
    Set Mdoc = New MODI.Document 
    'Set Mdoc = CreateObject("MODI.Document") 
    Set fso = New Scripting.FileSystemObject 

    DoEvents 
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 
    ''''''''''''''''''' Create OCRLog File ''''''''''''''''''' 
    OCRPath = App.Path & "\OCR Results Log\" 
    OCRName = Str(DateTime.Date) & " OCRresults" 
    If fso.FolderExists(OCRPath) = False Then 
     fso.CreateFolder (OCRPath) 
    End If 
    If fso.FileExists(OCRPath & OCRName & ".txt") = False Then 
     fso.CreateTextFile OCRPath & OCRName & ".txt" 
    End If 
    Set logfile = fso.OpenTextFile(OCRPath & OCRName & ".txt", ForAppending) 
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 

    On Error GoTo OCRErr 
    DoEvents 
    Mdoc.Create Path & "\" & Name 
    Mdoc.Images(0).OCR miLANG_ENGLISH, True, True 
    logfile.Write Mdoc.Images(0).Layout.Text 

    ScanMan = Mlay.Text 

    Mdoc.Close False 

    Set Mlay = Nothing 
    Set Mdoc = Nothing 

    Exit Function 

OCRErr: 
    logfile.WriteLine "OCR given (" & Err.Number & ") numbered (" & Err.Description & ") error." 
    logfile.Close 
End Function 

這得到了整個頁面,但我只是想要掃描的3 spesific區域,這樣我怎麼能achive呢?有沒有什麼功能?哪些只掃描X,Y座標?

回答

2

VB6的片段

Sub TestTextSelection() 

    Dim miTextSel As MODI.IMiSelectableItem 
    Dim miSelectRects As MODI.miSelectRects 
    Dim miSelectRect As MODI.miSelectRect 
    Dim strTextSelInfo As String 

    Set miTextSel = MiDocView1.TextSelection 
    Set miSelectRects = miTextSel.GetSelectRects 
    strTextSelInfo = _ 
    "Bounding rectangle page & coordinates: " & vbCrLf 
    For Each miSelectRect In miSelectRects 
    With miSelectRect 
     strTextSelInfo = strTextSelInfo & _ 
     .PageNumber & ", " & .Top & ", " & _ 
     .Left & ", " & .Bottom & ", " & _ 
     .Right & vbCrLf 
    End With 
    Next 
    MsgBox strTextSelInfo, vbInformation + vbOKOnly, _ 
    "Text Selection Info" 

    Set miSelectRect = Nothing 
    Set miSelectRects = Nothing 
    Set miTextSel = Nothing 

End Sub 

雖然這個問題被標記爲vb6,但答案是從vb.Net 2010。我希望vb.NET可以很容易地轉換爲vb6,只是問題只是幾個時間。

的基本思想是創建圖像的XML文件,然後運行在XML文件的查詢,以獲取所需的塊由(x1,y1)包圍和(x2,y2)的文本。

The core class

Imports System 
Imports System.IO 
Imports System.Xml 
Imports System.Linq 
Imports MODI 

Public Class clsCore 
    Public Sub New() 
     'blah blah blah 
    End Sub 

    Public Function GetTextFromCoordinates(ByVal iPath$, ByVal x1&, ByVal y1&, ByVal x2&, ByVal y2&) As String 
     Try 
      Dim xDoc As XElement = Me.ConvertImage2XML(iPath) 
      If IsNothing(xDoc) = False Then 
       Dim result As New XElement(<text/>) 
       Dim query = xDoc...<wd>.Where(Function(c) Val(CStr([email protected])) >= x1 And Val(CStr([email protected])) <= x2 And Val(CStr([email protected])) >= y1 And Val(CStr([email protected])) <= y2) 
       For Each ele As XElement In query 
        result.Add(CStr(ele.Value) & " ") 
       Next ele 
       Return Trim(result.Value) 
      Else 
       Return "" 
      End If 
     Catch ex As Exception 
      Console.WriteLine(ex.ToString) 
      Return ex.ToString 
     End Try 
    End Function 

    Private Function ConvertImage2XML(ByVal iPath$) As XElement 
     Try 
      If File.Exists(iPath) = True Then 
       Dim miDoc As New MODI.Document 
       Dim result As New XElement(<image path=<%= iPath %>/>) 
       miDoc.Create(iPath) 
       For Each miImg As MODI.Image In miDoc.Images 
        Dim page As New XElement(<page id=<%= result...<page>.Count + 1 %>/>) 
        miImg.OCR() 
        For Each miWord As MODI.Word In miImg.Layout.Words 
         Dim wd As New XElement(<wd block=<%= miWord.RegionId.ToString %>><%= miWord.Text %></wd>) 
         For Each miRect As MODI.MiRect In miWord.Rects 
          wd.Add(New XAttribute("left", miRect.Left)) 
          wd.Add(New XAttribute("top", miRect.Top)) 
          wd.Add(New XAttribute("right", miRect.Right)) 
          wd.Add(New XAttribute("bottom", miRect.Bottom)) 
         Next miRect 
         page.Add(wd) 
        Next miWord 
        result.Add(page) 
       Next miImg 
       Return result 
      Else 
       Return Nothing 
      End If 
     Catch ex As Exception 
      Console.WriteLine(ex.ToString) 
      Return Nothing 
     End Try 
    End Function 
End Class 

main module

Imports System 
Imports System.IO 
Imports System.Text.RegularExpressions 

Module modMain 

    Sub Main() 
     Dim iPath$ = "", iPos$ = "150,825,1400,1200" 
     Console.WriteLine("Enter path to file:") 
     iPath = Console.ReadLine() 
     Console.WriteLine("") 
     Console.WriteLine("Enter co-ordinates(i.e., x1,y1,x2,y2 or 150,825,1400,1200):") 
     iPos = Console.ReadLine() 
     Dim tmp As String() = Regex.Split(iPos, "\D+") 
     Dim outText$ = New clsCore().GetTextFromCoordinates(iPath, tmp(0), tmp(1), tmp(2), tmp(3)) 
     Console.WriteLine("") 
     Console.WriteLine(String.Format("{0}[({1},{2})-({3},{4})]:{5}{5}{6}", Dir(iPath), tmp(0), tmp(1), tmp(2), tmp(3), vbCrLf, outText)) 
     Console.ReadLine() 
    End Sub 

End Module 

UPDATE

以下示例報告查看器控件中圍繞用戶圖像選擇的邊界矩形的頁碼和座標。並且可以在以後在picturebox中使用。

Sub TestImageSelection() 

    Dim miImageSel As MODI.IMiSelectableImage 
    Dim lngPageNo As Long 
    Dim lngLeft As Long, lngTop As Long 
    Dim lngRight As Long, lngBottom As Long 
    Dim strImageSelInfo As String 

    Set miImageSel = MiDocView1.ImageSelection 
    miImageSel.GetBoundingRect lngPageNo, _ 
    lngLeft, lngTop, lngRight, lngBottom 
    strImageSelInfo = _ 
    "Page number: " & lngPageNo & vbCrLf & _ 
    "Bounding rectangle coordinates: " & vbCrLf & _ 
    lngLeft & ", " & lngTop & ", " & _ 
    lngRight & ", " & lngBottom 
    MsgBox strImageSelInfo, vbInformation + vbOKOnly, _ 
    "Image Selection Info" 

    Set miImageSel = Nothing 

End Sub 

希望這有助於。

+0

這是一個很好的伎倆,但正如我所說的「絕招」。我想在PictureBox的實際得到的圖片保存爲另一種PIC的確切點。比它上的OCR ..但是這也有幫助。我期待一些其他的答案,因爲這個原因+1,但很可能你的答案是接受的答案。 –

+1

''Picturebox''也可能是一個解決方案。查看我的更新。 – Cylian

1

我使用圖像和圖片框來裁切和調整圖片的大小以準確地包含在高清電影中。我使用滑塊控件移動了圖片(例如PicSize.Value) 圖片框設置爲1900x1080像素,屏幕爲Visible=false。 圖像框大小有Stretch設置爲true與大小並不重要,並顯示最終裁剪圖片的較小版本。

我將圖片保存爲bmp,因此它與我的AVCHD視頻在Adobe編輯器中的整合與視頻具有相同的幀大小。

這是主要的子程序:

-Private Sub Convert() 
'Creates a cropped and/or magnified fixed pixel 1900x1080 picture 
Dim file_name As String, LeftPos As Long 
Picture2.Picture = LoadPicture("") 
DoEvents 
' Resize the picture. 
LeftPos = 950 + HPos.Value - PicSize.Value/2 + PicWidth.Value * 20 
Picture2.PaintPicture Picture1.Picture, _ 
    LeftPos, VPos.Value, _ 
    PicSize.Value - (PicSize.Value * (PicWidth.Value/50)), _ 
    PicSize.Value * (Aspect.Value/100) 
Picture2.Picture = Picture2.Image 
TopValue.Caption = VPos.Value 
HPosValue.Caption = HPos.Value 
SizeValue.Caption = PicSize.Value 
AspectValue.Caption = Aspect.Value - 75 
StretchValue.Caption = PicWidth.Value 
Image1.Picture = Picture2.Image 'preview it 
End Sub 
+0

不錯的一個非常有用確實.. –

相關問題