Converting an image to byte[] with PDFBox

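I am using PDFBox 2.0. While parsing a PDF document, I also want to render the first page as an image and store it in HBase so that it can be shown in search results (I am going to build a search listing page, like the search page on amazon.com).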

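HBase accepts byte[] values for storage (indexing). I need to convert the image to byte[] and then store it in HBase. I already have the image rendering working, but how do I convert the result to byte[]?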

    import java.awt.image.BufferedImage;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.ImageType;
    import org.apache.pdfbox.rendering.PDFRenderer;
    ...
    PDDocument document = PDDocument.load(file, "");
    BufferedImage image = null;
    try {
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        if (document.isEncrypted()) {
            try {
                System.out.println("Trying to decrypt...");
                document.setAllSecurityToBeRemoved(true);
                System.out.println("The file has been decrypted.");
            } catch (Exception e) {
                throw new Exception("cannot be decrypted. ", e);
            }
        }

        // Render the first page (index 0) once, at 300 DPI.
        image = pdfRenderer.renderImageWithDPI(0, 300, ImageType.RGB);
        document.close();

    } catch (Exception e) {
        e.printStackTrace();
    }

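If I write ImageIOUtil.writeImage(image, fileName+".jpg", 300); just above document.close();, the program creates a JPG file in the project path. I need to put the image into a byte[] array instead of creating a file. Is that possible?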

2016-04-19 Munchmallow

This can be done with ImageIO.write(RenderedImage, String, OutputStream), which writes to an arbitrary OutputStream rather than to disk. A ByteArrayOutputStream collects the output bytes in an in-memory array.

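import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import javax.imageio.ImageIO;
...
// example image (RGB, because JPEG has no alpha channel)
BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);

// encode into an in-memory byte array instead of a file
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ImageIO.write(image, "jpg", bos);
byte[] output = bos.toByteArray();
System.out.println(Arrays.toString(output));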
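
For the rendered page from the question, the same idea could be wrapped in a small helper. This is only a sketch: the class name ImageBytes and the method toBytes are made up for illustration, and the blank 4x3 image stands in for the BufferedImage returned by PDFRenderer. It also checks the return value of ImageIO.write, which is false when no registered writer can handle the image.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageBytes {

    // Hypothetical helper: encodes the image in memory and returns the encoded bytes.
    static byte[] toBytes(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // ImageIO.write returns false if no suitable writer was found.
        if (!ImageIO.write(image, format, bos)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the page image rendered by PDFRenderer in the question.
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = toBytes(image, "jpg");
        System.out.println("Encoded " + jpeg.length + " bytes");
    }
}
```

The resulting array is what would be handed to HBase as the cell value.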

2016-04-19 20:33:27 Adam

Which library does ByteOutputStream come from? Is it 'com.sun.xml.internal.messaging.saaj.util.ByteOutputStream;'? – Munchmallow

My bad, it should be java.io.ByteArrayOutputStream, which is a core Java class. I have updated the answer... – Adam

Thank you very much. Now I have to figure out how to fetch it from HBase and display it as an image in the search list. – Munchmallow
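
For the reverse direction mentioned in the last comment, one possible approach is to decode the stored bytes with ImageIO.read over a ByteArrayInputStream. The sketch below leaves HBase out entirely and assumes the byte[] has already been fetched; the class name ImageFromBytes and the method fromBytes are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageFromBytes {

    // Hypothetical helper: decodes encoded image bytes back into a BufferedImage.
    static BufferedImage fromBytes(byte[] data) throws IOException {
        // ImageIO.read returns null when no registered reader recognizes the data.
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
        if (image == null) {
            throw new IOException("Unrecognized image data");
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Encode a small example image in memory, standing in for bytes read from storage.
        BufferedImage original = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ImageIO.write(original, "png", bos);

        BufferedImage decoded = fromBytes(bos.toByteArray());
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());
    }
}
```

If the image only needs to be shown in a web page, the stored bytes could also be served as they are with a matching content type, without decoding them first.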