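I'm new to tesseract OCR. I tried converting an image to TIF and running it through tesseract from cmd on Windows to see the output, but I couldn't get it to work. Can you help me? What command should I use? Can I test tesseract OCR from the Windows command line?
Here is my sample image: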
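The simplest tesseract.exe syntax is: tesseract.exe inputimage output-text-file. This assumes that tesseract.exe has been added to the PATH environment variable. If the text in your image is particularly hard to recognize, you can add the -psm N argument, where N is one of the page segmentation modes listed in the usage text below.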
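I found that the plain syntax (without any -psm switch) works fine on the image you attached, unless that level of accuracy is not good enough for you.
Note that non-English characters (such as the symbol next to the word "prescription") are not recognized; my default installation only includes the English training data.
Here is the tesseract syntax description: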
C:\Users\vish\Desktop>tesseract.exe
Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
pagesegmode values are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
-l lang and/or -psm pagesegmode must occur before anyconfigfile.
Single options:
-v --version: version info
--list-langs: list available languages for tesseract engine
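If you would rather drive tesseract from a script than type the command by hand, the usage text above can be wrapped in a small helper. This is only a sketch in Python, assuming the 3.02-style flags shown in the usage (-l, -psm) and that tesseract is on PATH; newer releases may spell the flags differently. build_tesseract_command and run_tesseract are hypothetical names of my own, not part of tesseract.

```python
import shutil
import subprocess


def build_tesseract_command(image, outputbase, lang=None, psm=None, configfiles=()):
    """Build an argument list that follows the usage text:
    tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
    """
    if psm is not None and not 0 <= psm <= 10:
        raise ValueError("pagesegmode must be between 0 and 10")
    cmd = ["tesseract", image, outputbase]
    if lang is not None:
        cmd += ["-l", lang]
    if psm is not None:
        cmd += ["-psm", str(psm)]
    # Config files go last: -l and -psm must occur before any configfile.
    cmd += list(configfiles)
    return cmd


def run_tesseract(image, outputbase, **options):
    """Run tesseract (hypothetical wrapper) and return the text file it should write."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract was not found on PATH")
    subprocess.run(build_tesseract_command(image, outputbase, **options), check=True)
    # tesseract 3.02 appends .txt to outputbase, so "out" becomes "out.txt".
    return outputbase + ".txt"
```

For example, build_tesseract_command("ECL8R.png", "out", psm=6) produces the arguments for: tesseract ECL8R.png out -psm 6.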
Here is the output for your image (note: when I downloaded it, it had been converted to a PNG image):
C:\Users\vish\Desktop>tesseract.exe ECL8R.png out.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica
C:\Users\vish\Desktop>type out.txt.txt
1 Project Background
A prescription (R) is a written order by a physician or medical doctor to a pharmacist in the form of
medication instructions for an individual patient. You can't get prescription medicines unless someone
with authority prescribes them. Usually, this means a written prescription from your doctor. Dentists,
optometrists, midwives and nurse practitioners may also be authorized to prescribe medicines for you.
It can also be defined as an order to take certain medications.
A prescription has legal implications; this means the prescriber must assume his responsibility for the
clinical care ofthe patient.
Recently, the term "prescription” has known a wider usage being used for clinical assessments,
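One thing to watch in the session above: the output base was given as out.txt, and tesseract added its own .txt, which is why the result ended up in out.txt.txt. The text file is written as UTF-8, so reading it as UTF-8 (rather than through a legacy console code page) keeps characters such as curly quotes intact. A minimal Python sketch, where output_text_path and read_ocr_text are hypothetical helper names and the .txt suffix rule is assumed from the 3.02 session shown here:

```python
from pathlib import Path


def output_text_path(outputbase):
    """Return the file tesseract 3.02 writes for an outputbase (it appends .txt)."""
    return Path(str(outputbase) + ".txt")


def read_ocr_text(outputbase):
    """Read the recognized text, decoding it as UTF-8."""
    return output_text_path(outputbase).read_text(encoding="utf-8")
```

Passing "out" as the output base gives out.txt, which is usually what you want; passing "out.txt" gives out.txt.txt, as in the session.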
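Please explain in more detail what you have already tried. – Vish 2014-10-09 10:29:27
@Vish I installed the tesseract library from its website, and from cmd I tried to convert the text image with: tesseract imagename.tif output. But I couldn't get any output. – Akunar 2014-10-09 23:57:28
With the syntax you typed, the output is stored in the file output.txt. Did you check whether that file was created? Also, can you upload your TIF file? If I have some time, I can check it with my tesseract installation. – Vish 2014-10-10 05:44:08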