2016-03-04

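Splitting words with Tesseract tess-two on Android

I am trying to read questions and answers from an image on Android using Tesseract tess-two. Currently I get a single string containing every word in the image. My problem is that I cannot split the answers apart. Is it possible to split the answers with TessBaseAPI? A solution in Java/Android would also be fine ;)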

public String detectText(Bitmap bitmap) { 
    Log.d(TAG, "Initialization of TessBaseApi"); 
    TessDataManager.initTessTrainedData(context); 
    TessBaseAPI tessBaseAPI = new TessBaseAPI(); 
    String path = TessDataManager.getTesseractFolder(); 
    Log.d(TAG, "Tess folder: " + path); 
    tessBaseAPI.setDebug(true); 
    tessBaseAPI.init(path, "eng"); 
    tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ" + 
      "abcdefghijklnmopqrstuvwxyzäüößÄÖÜ[email protected]#$%^&*+=-;()/"); 
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_CUBE_COMBINED); 

    Log.d(TAG, "Ended initialization of TessEngine"); 
    Log.d(TAG, "Running inspection on bitmap"); 
    tessBaseAPI.setImage(bitmap); 

    String inspection = tessBaseAPI.getUTF8Text(); 
    Log.d(TAG, "Got data: " + inspection); 
    tessBaseAPI.end(); 
    System.gc(); 
    return inspection; 
} 

Here is an example of what the image looks like.

Answer


This is what worked:

tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SPARSE_TEXT);
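Once the page segmentation mode is set, the string returned by getUTF8Text() still has to be cut into pieces. A minimal sketch of one way to do that in plain Java follows. It assumes that each question or answer ends up on its own line of the recognized text, which depends on the image layout and is not guaranteed. OcrTextSplitter and splitLines are hypothetical names used for illustration; they are not part of tess-two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of tess-two: post-processes the string
// returned by TessBaseAPI.getUTF8Text().
final class OcrTextSplitter {

    private OcrTextSplitter() {
        // Static utility class, no instances.
    }

    /**
     * Splits raw OCR output into trimmed, non-empty lines.
     * Assumption: every question or answer sits on its own line.
     */
    static List<String> splitLines(String ocrText) {
        List<String> lines = new ArrayList<>();
        if (ocrText == null) {
            return lines;
        }
        // Split on Unix or Windows line breaks.
        for (String raw : ocrText.split("\\r?\\n")) {
            String line = raw.trim();
            // Skip blank lines that separate text blocks.
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

In detectText this could be called as OcrTextSplitter.splitLines(inspection). If the question is the first block in the image, the first element would be the question and the remaining elements the answers, but that ordering is also an assumption about the layout.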