2014-02-25 71 views
1

tess4j with Spring MVC

I have already tried tess4j as a standalone Java program, and it gives the correct text output.

Now I am trying to create a Spring MVC web project. I added the tess4j dependency to the pom, and I also added the tess4j source code to the project.

File imageFile = new File("D:/Data/jars/tess/eurotext.tif");
Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
// Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
try {
    String result = instance.doOCR(imageFile);
    System.out.println(result);
} catch (TesseractException e) {
    System.err.println(e.getMessage());
}

The code above works fine when I run it as a standalone Java program inside the project, so the jar files are clearly on the build path correctly.

But when I call the same code from a controller mapping or a service, it throws a runtime exception.

SEVERE: Unsupported image format. May need to install JAI Image I/O package. 
https://java.net/projects/jai-imageio/ 
java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package. 
https://java.net/projects/jai-imageio/ 
    at net.sourceforge.vietocr.ImageIOHelper.getIIOImageList(ImageIOHelper.java:324) 
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:173) 
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:158) 
    at com.ocr.tesseract.TesseractExample.getTextFromImage(TesseractExample.java:27) 
    at com.cogz.tp.controller.HomeController.view(HomeController.java:51) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
    at java.lang.reflect.Method.invoke(Method.java:597) 
    at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:214) 
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132) 
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104) 
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:748) 
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:689) 
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:83) 
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:945) 
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:876) 
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:931) 
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:822) 
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:621) 
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:807) 
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
    at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88) 
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:108) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) 
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) 
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) 
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:409) 
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1044) 
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607) 
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) 
    at java.lang.Thread.run(Thread.java:662) 
java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package. 
https://java.net/projects/jai-imageio/ 

Please let me know what is missing. Thanks in advance.

+0

It looks like it ran into trouble with the 'jai-imageio' library. Did it load properly? – nguyenq

+0

No, it didn't load properly. I found that out by debugging and added a new post: http://stackoverflow.com/questions/22035048/imageio-jar-works-as-standalone-but-not-as-a-web-project – user3321883

+1

You may want to call 'ImageIO.scanForPlugins();' before 'doOCR'. – nguyenq

Answers

4

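I faced a similar problem using tess4j in a Dynamic Web Project, and the comment from @nguyenq helped me out. tess4j mostly uses a TIFF handler for optical recognition, and the default ImageIO does not include the dependency that requires, so jai-imageio.jar is needed. All I did was add the line ImageIO.scanForPlugins() before calling the wrapper class that performs doOCR. I have the following jars in my lib: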

tess4j.jar

jai_imageio.jar

ghost4j-0.3.1.jar

jna.jar

junit-4.10.jar

Here is the sample code:

TessractOCR tessocr = new TessractOCR();
ImageIO.scanForPlugins();
String extractedString = tessocr.extractTextFromImage(binarizrImage);

The function:

public static String extractTextFromImage(BufferedImage image) {
    String result = null;
    try {
        // Write the image to disk so that doOCR can read it from a file
        File outputfile = new File("saved.png");
        ImageIO.write(image, "png", outputfile);

        Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
        instance.setDatapath("E:\\OCR-data\\Tess4J-1.2-src\\Tess4J");

        result = instance.doOCR(outputfile);
        System.out.println(result);
    } catch (Exception e) {
        System.err.println(e.getMessage());
    }
    return result; // stays null if writing the file or OCR failed
}

It works 100% :)
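To see whether a TIFF reader is actually registered in a given environment, a small diagnostic along these lines may help. It is only a sketch: the class name TiffPluginCheck and the helper hasReader are made up for illustration, and it uses only the standard javax.imageio API, not tess4j.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

// Hypothetical diagnostic class, not part of tess4j or jai-imageio.
public class TiffPluginCheck {

    /** Returns true if ImageIO currently has at least one reader for the given format name. */
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Result depends on the JDK version and on which jars are visible to this class loader.
        System.out.println("TIFF reader before scan: " + hasReader("tiff"));

        // Ask ImageIO to look for plugins again, as suggested in the comments.
        ImageIO.scanForPlugins();

        System.out.println("TIFF reader after scan:  " + hasReader("tiff"));
    }
}
```

If the same two checks are run from inside the controller, false before the scan and true after it would suggest the jar is present but was not registered at startup. False both times would suggest the jar is not visible to the web application at all.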

0

Here is the working code, shared for everyone:

public static String doOCR(File pdfInvoice) {
    String result = "";
    long startTime = System.currentTimeMillis();
    // Note: the pdfInvoice parameter is not used; the input path is hard-coded here
    File imageFile = new File("D:\\docfolder\\9011121584.pdf");
    Tesseract instance = Tesseract.getInstance();

    try {
        // Re-scan for ImageIO plugins (e.g. jai-imageio) before running OCR
        ImageIO.scanForPlugins();
        result = instance.doOCR(imageFile);

        long totalTime = System.currentTimeMillis() - startTime;
        System.out.println("Total Time Taken For OCR: " + (totalTime / 1000));
    } catch (Exception e) {
        System.err.println(e.getMessage());
        result = "";
    }
    return result;
}
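Both answers call ImageIO.scanForPlugins() right before each doOCR. In a web application that handles many requests, one possible variation is to do the scan only once and reuse the result. The sketch below assumes that a single scan per class loader is enough for your setup; the class ImageIoPlugins and its method ensureScanned are hypothetical names, not part of tess4j or ImageIO.

```java
import javax.imageio.ImageIO;

// Hypothetical helper, not part of tess4j: runs the plugin scan at most once.
public final class ImageIoPlugins {

    private static boolean scanned = false;

    private ImageIoPlugins() {
        // utility class, no instances
    }

    /**
     * Calls ImageIO.scanForPlugins() on the first invocation only.
     * Synchronized so that concurrent requests wait until the scan has finished.
     *
     * @return true if this call performed the scan, false if it was already done
     */
    public static synchronized boolean ensureScanned() {
        if (scanned) {
            return false;
        }
        ImageIO.scanForPlugins();
        scanned = true;
        return true;
    }
}
```

A controller or service would then call ImageIoPlugins.ensureScanned() before instance.doOCR(...) instead of calling ImageIO.scanForPlugins() directly.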