Adding user words to Tesseract

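I am using Tesseract in my Android application. I have defined my own "user words" file and added the line shown in bold (wrapped in ** below) so that OCR takes the user words file into account.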

String language = "deu"; 
datapath = getFilesDir()+ "/tesseract/"; 
Tess = new TessBaseAPI(); 

checkFile(new File(datapath + "tessdata/")); 
**Tess.setVariable("user_words_suffix","deu.user-words");** 
Tess.init(datapath, language);

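I have not defined a user patterns file, because there are no specific patterns in my images. I simply copied deu.user-words, a UTF-8 text file, into the tessdata folder. Is that enough for the OCR configuration? Or should I unpack deu.traineddata, add this file to it, and then repack it? If so, could you give me some hints on how to do that?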

Source

2016-12-14 MKH

You don't need to include the language prefix in the value you set in code:

Tess.setVariable("user_words_suffix", "user-words");

Just make sure the file name itself is prefixed with the matching language code, i.e. deu.user-words.
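
To make the naming rule concrete, here is a minimal sketch in plain Java of how the file could be created in the right place. It assumes the file lives at <datapath>/tessdata/<language>.<suffix> and contains one word per line in UTF-8. The class and method names (UserWordsFile, fileName, write) and the sample words are made up for illustration and are not part of Tesseract or tess-two. It uses java.nio.file, which may not be available on older Android API levels, so treat it as an illustration rather than drop-in app code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tesseract or tess-two.
class UserWordsFile {

    // Builds the expected file name "<language>.<suffix>",
    // e.g. "deu" + "user-words" gives "deu.user-words".
    static String fileName(String language, String suffix) {
        return language + "." + suffix;
    }

    // Writes the words, one per line, as UTF-8 into
    // <datapath>/tessdata/<language>.<suffix> and returns that path.
    // Assumption: this is the location used alongside the traineddata file.
    static Path write(Path datapath, String language, String suffix, List<String> words)
            throws IOException {
        Path tessdata = datapath.resolve("tessdata");
        Files.createDirectories(tessdata);
        Path target = tessdata.resolve(fileName(language, suffix));
        return Files.write(target, words, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for getFilesDir() + "/tesseract/".
        Path datapath = Files.createTempDirectory("tesseract");
        Path file = write(datapath, "deu", "user-words",
                Arrays.asList("Straße", "Bahnhof"));
        System.out.println(file.getFileName()); // prints "deu.user-words"
    }
}
```

With the file named this way, the suffix passed to setVariable is only "user-words", and the "deu" part comes from the language given to init.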

https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc
https://github.com/tesseract-ocr/tesseract/wiki/ControlParams

Source

2016-12-16 04:17:12 nguyenq
