2016-05-14 53 views
0

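Android: Tesseract couldn't load any languages

Hi everyone, I'm trying to run Tesseract to get the text from an image, but I run into the following error: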

Exception in thread "main" java.lang.Error: Invalid memory access 
at com.sun.jna.Native.invokePointer(Native Method) 
at com.sun.jna.Function.invokePointer(Function.java:477) 
at com.sun.jna.Function.invoke(Function.java:411) 
at com.sun.jna.Function.invoke(Function.java:323) 
at com.sun.jna.Library$Handler.invoke(Library.java:236) 
at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source) 
at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:436) 
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:291) 
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:212) 
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:196) 
at Crop_Image.main(Crop_Image.java:98) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) 
Error opening data file ./tessdata/eng.traineddata 
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. 
Failed loading language 'eng' 
Tesseract couldn't load any languages! 

I'm loading a JPG image file that contains English text. This is how I load the file and then try to get the text from it:

public static void main(String[] args){ 

    String result = ""; 

    File imageFile = new File("C:\\Users\\user\\Desktop\\Untitled.jpg"); 
    Tesseract instance = new Tesseract(); 

    try { 
     result = instance.doOCR(imageFile); 
     result.toString(); 

    } catch (Exception e) { 
     e.printStackTrace(); 
     System.err.println(e.getMessage()); 
    } 
} 

I'm using Maven, and these are the dependencies in my project's pom file:

<dependencies> 

    <dependency> 
     <groupId>nu.pattern</groupId> 
     <artifactId>opencv</artifactId> 
     <version>2.4.9-4</version> 
    </dependency> 

    <dependency> 
     <groupId>net.sourceforge.tess4j</groupId> 
     <artifactId>tess4j</artifactId> 
     <version>3.1.0</version> 
    </dependency> 

</dependencies> 

What could be causing this error?

Answers

2

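I looked at your code, and the problem is probably in how Tesseract is initialized. Since you are using Maven, you need to point it at the exact location of the tessdata folder, as nguyenq suggested. Here is what you should do: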

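// Imports needed by this method
import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.util.LoadLibs;

public static String Image_To_Text(String image_path){

    String result = "";

    // Use the path passed in by the caller
    File imageFile = new File(image_path);

    // Same constructor the question uses
    Tesseract instance = new Tesseract();
    //In case you don't have your own tessdata, let it also be extracted for you
    File tessDataFolder = LoadLibs.extractTessResources("tessdata");

    //Set the tessdata path
    instance.setDatapath(tessDataFolder.getAbsolutePath());

    try {
        result = instance.doOCR(imageFile);

    } catch (Exception e) {
        // OCR failed; the method then returns the empty string
        e.printStackTrace();
    }

    return result;
}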
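
Before calling doOCR, it can help to confirm that the folder you are about to hand to setDatapath really contains the language file named in the error message (eng.traineddata). The following is only a minimal sketch using the standard library; the class name TessdataCheck and the method missingLanguages are made up for illustration and are not part of Tess4J.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tess4J: checks a folder for "<lang>.traineddata" files.
public class TessdataCheck {

    /** Returns the language codes whose "<lang>.traineddata" file is not in tessdataDir. */
    public static List<String> missingLanguages(File tessdataDir, String... languages) {
        List<String> missing = new ArrayList<>();
        for (String lang : languages) {
            File trained = new File(tessdataDir, lang + ".traineddata");
            if (!trained.isFile()) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the folder is given as the first argument, defaulting to ./tessdata
        File dir = new File(args.length > 0 ? args[0] : "tessdata");
        List<String> missing = missingLanguages(dir, "eng");
        if (missing.isEmpty()) {
            System.out.println("All language files found in " + dir.getAbsolutePath());
        } else {
            System.out.println("Missing in " + dir.getAbsolutePath() + ": " + missing);
        }
    }
}
```

In the method above you could call TessdataCheck.missingLanguages(tessDataFolder, "eng") right after the extraction step and stop early if the list is not empty, which gives a clearer message than the native "Invalid memory access" error.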

Thank you very much! This solved my problem. – user6006748

0

You need to pass the parent directory of the tessdata folder to instance.setDatapath.

File tessDataFolder = LoadLibs.extractTessResources("tessdata"); // Maven build bundles English data 
instance.setDatapath(tessDataFolder.getParent()); 

See http://tess4j.sourceforge.net/tutorial
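
The two answers hand different values to setDatapath: one passes the tessdata folder itself (getAbsolutePath()), the other its parent (getParent()). If you are unsure which of the two a given path is, a small probe that does not depend on Tess4J can tell you where eng.traineddata sits relative to it. This is a sketch only; the class DatapathProbe and its probe method are hypothetical names, and it does not decide which form your Tess4J version expects.

```java
import java.io.File;

// Hypothetical helper, not part of Tess4J.
public class DatapathProbe {

    /**
     * Reports how dataPath relates to "<lang>.traineddata":
     * "folder" if the file is directly inside dataPath,
     * "parent" if it is inside dataPath/tessdata,
     * "none" if it is in neither place.
     */
    public static String probe(File dataPath, String lang) {
        String name = lang + ".traineddata";
        if (new File(dataPath, name).isFile()) {
            return "folder";
        }
        if (new File(new File(dataPath, "tessdata"), name).isFile()) {
            return "parent";
        }
        return "none";
    }

    public static void main(String[] args) {
        // Assumption: the path to test is the first argument, defaulting to the working directory
        File path = new File(args.length > 0 ? args[0] : ".");
        System.out.println(path.getAbsolutePath() + " -> " + probe(path, "eng"));
    }
}
```

The log in the question shows the native library looking for ./tessdata/eng.traineddata, so a result of "none" for the working directory would be consistent with the error reported there.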


Yes, I thought as much, but I'm using 'maven', so I get the same result even if I point to the directory of the '.jar' file. – user6006748


You need to extract it first. See the update. – nguyenq


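@nguyenq Hi nguyen! I'm a big fan of VietOCR, your product that uses Tesseract. I have to develop something similar, but it must recognize Arabic characters. Could I get your code and try to modify it to recognize Arabic? I tried Tesseract, but the ara.traineddata file isn't very good and I didn't get the results I wanted. Could you help me? – Hohenheim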