2013-02-22 137 views
8

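PDFBox: convert a PDF to an image byte[]

Using PDFBox, is it possible to convert a PDF (or a PDF byte[]) into an image byte[]? I have looked through several online examples, and the only ones I can find describe either writing the converted file directly to the file system or converting it to a Java AWT object.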

I would rather not incur the IO of writing an image file to the file system, reading it back into a byte[], and then deleting it.

So I can do this:

String destinationImageFormat = "jpg"; 
boolean success = false; 
InputStream is = getClass().getClassLoader().getResourceAsStream("example.pdf"); 
PDDocument pdf = PDDocument.load(is, true); 

int resolution = 256; 
String password = ""; 
String outputPrefix = "myImageFile"; 

PDFImageWriter imageWriter = new PDFImageWriter();  

success = imageWriter.writeImage(pdf, 
        destinationImageFormat, 
        password, 
        1, 
        2, 
        outputPrefix, 
        BufferedImage.TYPE_INT_RGB, 
        resolution); 

And I can also do this:

InputStream is = getClass().getClassLoader().getResourceAsStream("example.pdf"); 

PDDocument pdf = PDDocument.load(is, true); 
List<PDPage> pages = pdf.getDocumentCatalog().getAllPages(); 

for (PDPage page : pages) 
{ 
    BufferedImage image = page.convertToImage(); 
} 

What I am not clear on is how to turn the BufferedImage into a byte[]. I know it gets converted to a file output stream inside imageWriter.writeImage(), but I am not clear on how the API works.

Answers

11

You can use ImageIO.write to write to an OutputStream. To get a byte[], use a ByteArrayOutputStream, then call toByteArray() on it.
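As a minimal sketch of that suggestion, the following uses only the JDK. The class name ImageBytes and the helper toBytes are hypothetical names chosen for illustration, and the blank BufferedImage stands in for whatever page.convertToImage() returns in the question.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of PDFBox.
final class ImageBytes {

    // Encodes an image in the given format ("png", "jpg", ...) entirely in memory.
    static byte[] toBytes(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no installed writer supports the format.
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered PDF page.
        BufferedImage image = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        byte[] png = toBytes(image, "png");
        System.out.println("PNG bytes: " + png.length);
    }
}
```

Checking the boolean returned by ImageIO.write is worth the extra lines: without it, an unsupported format name would silently produce an empty array.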

+1

Thanks. This works as expected. If I had enough reputation I would upvote you, but this is my first post on StackOverflow. – user2100746 2013-02-22 22:08:26

+0

You're welcome. You should be able to mark it as accepted. – aditsu 2013-02-22 22:09:19

+0

@user2100746 You should mark the answer as accepted :) – Genjuro 2013-05-21 08:45:42

0
try {
    PDDocument document = PDDocument.load(PdfInfo.getPDFWAY());
    // Decrypt first if the PDF is password protected.
    if (document.isEncrypted()) {
        document.decrypt(PdfInfo.getPASSWORD());
    }
    // Map the configured color mode to a BufferedImage type.
    if ("bilevel".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_BINARY);
    } else if ("indexed".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_INDEXED);
    } else if ("gray".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_GRAY);
    } else if ("rgb".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_INT_RGB);
    } else if ("rgba".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_INT_ARGB);
    } else {
        System.exit(2); // unknown color mode
    }
    // Writes the selected pages as image files named from the output prefix.
    PDFImageWriter imageWriter = new PDFImageWriter();
    boolean success = imageWriter.writeImage(document, PdfInfo.getIMAGE_FORMAT(), PdfInfo.getPASSWORD(),
            PdfInfo.getSTART_PAGE(), PdfInfo.getEND_PAGE(), PdfInfo.getOUTPUT_PREFIX(),
            PdfInfo.getIMAGETYPE(), PdfInfo.getRESOLUTION());
    if (!success) {
        System.exit(1);
    }
    document.close();
} catch (IOException | CryptographyException | InvalidPasswordException ex) {
    Logger.getLogger(PdfToImae.class.getName()).log(Level.SEVERE, null, ex);
}
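
// Static configuration holder read by the conversion snippet.
public class PdfInfo {
    private static String PDFWAY;                 // path of the PDF to load
    private static String OUTPUT_PREFIX;          // prefix for the generated image files
    private static String PASSWORD;
    private static int START_PAGE = 1;
    private static int END_PAGE = Integer.MAX_VALUE;
    private static String IMAGE_FORMAT = "jpg";
    private static String COLOR = "rgb";
    private static int RESOLUTION = 256;
    private static int IMAGETYPE = BufferedImage.TYPE_INT_RGB; // matches COLOR = "rgb"
    private static String filename;
    private static String filePath = "";
    // Static getters and setters (getPDFWAY(), setIMAGETYPE(int), ...) omitted.
}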
0

Add the Maven dependency:

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox --> 
    <dependency> 
     <groupId>org.apache.pdfbox</groupId> 
     <artifactId>pdfbox</artifactId> 
     <version>2.0.1</version> 
    </dependency> 

Then convert the PDF to images:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

// Renders each page to a PNG file next to the PDF and returns the file names.
private List<String> savePDF(String filePath) throws IOException {
    List<String> result = new ArrayList<>();
    File file = new File(filePath);

    // try-with-resources closes the document even if rendering fails.
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        for (int i = 0; i < pageCount; i++) {
            String pngFileName = file.getPath() + "." + (i + 1) + ".png";
            try (OutputStream out = new FileOutputStream(pngFileName)) {
                ImageIO.write(renderer.renderImageWithDPI(i, 96), "png", out);
            }
            result.add(pngFileName);
        }
    }
    return result;
}

Edit:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

// Renders each page to PNG in memory and returns one byte array per page.
private List<byte[]> savePDF(String filePath) throws IOException {
    List<byte[]> result = new ArrayList<>();
    File file = new File(filePath);

    // try-with-resources closes the document even if rendering fails.
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        for (int i = 0; i < pageCount; i++) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(renderer.renderImageWithDPI(i, 96), "png", out);

            result.add(out.toByteArray()); // here you can get a byte array
        }
    }
    return result;
}
+0

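The OP asked how to have pdfbox render the pdf directly to a 'byte[]' rather than to a file. Your answer, on the other hand, only shows yet another way of rendering it to a file. – mkl 2016-12-27 07:11:59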

+0

Replace the FileOutputStream with a ByteArrayOutputStream. – BeeNoisy 2016-12-27 09:12:27

+0

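'ByteArrayOutputStream out = new ByteArrayOutputStream(pngFileName)': 'ByteArrayOutputStream' has only two constructors, one without arguments and one with an int argument. So a call with a 'String' argument will not even compile, unless you mean a 'ByteArrayOutputStream' other than the one in 'java.io'. – mkl 2016-12-29 20:53:28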
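To illustrate the two constructors mentioned in that comment, here is a small JDK-only sketch. The class name BufferConstructors and the helper writeBytes are hypothetical. The int argument is only an initial capacity, so the stream still grows when more bytes are written than that.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, for illustration only.
final class BufferConstructors {

    // Writes n bytes into a stream created with the given initial capacity
    // and returns how many bytes the stream holds afterwards.
    static int writeBytes(int initialCapacity, int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(initialCapacity);
        for (int i = 0; i < n; i++) {
            out.write(i); // write(int) stores the low-order byte
        }
        return out.size();
    }

    public static void main(String[] args) {
        // No-argument constructor: the buffer starts at the default capacity.
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        plain.write(42);
        System.out.println("plain size: " + plain.size());

        // int constructor: a capacity hint, not a limit.
        System.out.println("grown size: " + writeBytes(4, 1000));
    }
}
```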