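Detecting text block position and size in tesseract's command-line mode
Tesseract OCR has a command-line interface that lets us recognize the text in an image using certain parameters.
The input arguments are imagename (the path to the image), outputbase (the base name of the output file for the recognized text) and the -psm pagesegmode parameter.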
pagesegmode values are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
-l lang and/or -psm pagesegmode must occur before any configfile.
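To make the argument order concrete, here is a minimal Python sketch that only assembles the command line described above. The helper name `build_tesseract_command`, the file names, and the use of `hocr` as an example configfile are assumptions for illustration and are not part of the usage text quoted here. Newer Tesseract releases spell the option `--psm`, so the flag may need adjusting for your version.

```python
def build_tesseract_command(imagename, outputbase, psm=3, lang=None, configfiles=()):
    """Assemble a tesseract argument list (hypothetical helper).

    Follows the rule quoted above: -l lang and -psm pagesegmode
    must come before any configfile.
    """
    if not 0 <= psm <= 10:
        raise ValueError("pagesegmode must be between 0 and 10")
    cmd = ["tesseract", imagename, outputbase]
    if lang is not None:
        cmd += ["-l", lang]
    cmd += ["-psm", str(psm)]  # assumption: older CLI; newer versions use --psm
    cmd += list(configfiles)   # configfiles always go last
    return cmd


if __name__ == "__main__":
    # "page.png", "page" and "hocr" are placeholder values for illustration.
    cmd = build_tesseract_command("page.png", "page", psm=6, lang="eng",
                                  configfiles=["hocr"])
    print(" ".join(cmd))
    # To actually run it (requires tesseract on PATH):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the list separately from running it makes it easy to check the ordering before calling the real binary.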
But can it write the location and size of the recognized text blocks to a specific file, or is that internal information?
Thank you very much! This is what I needed.