Rendering part of a PDF's content as an image


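Is there any tool that renders a PDF document to an image with only part of its content? For example, only the text, without images and vector graphics, or only the images and vector graphics, without the text.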

2014-10-06 Yu Liang

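Does it have to be Ghostscript, or are you also prepared to do a bit of Java programming? – mkl 2014-10-06 07:02:19

Any suggestions are welcome. – 2014-10-06 07:08:46

The Apache Java library PDFBox contains code for rendering PDF pages (much improved in the current 2.0.0 development snapshot compared with the current 1.8.x release). That code essentially relies on the 'PageDrawer' class. You can fairly easily adapt that class to draw only the things you choose. – mkl 2014-10-06 07:25:57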

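The "traditional" way to do this is to preprocess the PDF file so that only the elements you want remain, and then rasterize what is left.

For example, I have implemented a PDF-to-iPad workflow in which callas pdfToolbox (note: I am affiliated with this company) was used to split a PDF file into a "text only" file and an "everything but text" file. The "everything but text" file was then rasterized, and the two files were recombined.

So whichever tool you want to use, I would look at how that tool can preprocess the file to remove the unwanted elements, or how it can split out the ones you want. Then use the tool's normal rasterization functionality.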
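To make the preprocessing idea concrete, here is a minimal sketch in plain Java of what "remove the text, keep everything else" means at the content-stream level. In a PDF page's content stream, text objects are bracketed by the `BT` and `ET` operators, so dropping everything between them leaves images and vector drawing untouched. The class and method names (`ContentStreamFilter`, `stripTextObjects`) are made up for this illustration, and the filter assumes whitespace-separated operators and no string literal containing a standalone `BT` or `ET`. A real tool would use a proper content-stream parser and would also have to decompress and rewrite the stream inside the PDF file.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy filter (hypothetical name): drops text objects from a PDF content stream. */
final class ContentStreamFilter {

    /**
     * Returns the content stream with every BT ... ET text object removed.
     * Assumption: tokens are whitespace-separated and no string literal
     * contains a standalone "BT" or "ET" token.
     */
    static String stripTextObjects(String content) {
        List<String> kept = new ArrayList<>();
        boolean insideText = false;
        for (String token : content.trim().split("\\s+")) {
            if (token.equals("BT")) {
                insideText = true;   // a text object starts: stop copying
            } else if (token.equals("ET")) {
                insideText = false;  // the text object ends: resume copying
            } else if (!insideText) {
                kept.add(token);     // image and vector operators pass through
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // An image placement, a text object, and a filled rectangle.
        String page = "q 200 0 0 100 50 600 cm /Im1 Do Q "
                + "BT /F1 12 Tf 72 700 Td (Hello) Tj ET "
                + "0 0 100 100 re f";
        System.out.println(stripTextObjects(page));
    }
}
```

The opposite split ("text only") is harder to do this naively, because graphics-state operators outside the text objects (such as `cm`, `q` and `Q`) still affect where the text ends up, which is one reason to let a dedicated tool do the splitting.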


2014-10-06 07:23:52

With Debenu Quick PDF Library you can do the extraction in two ways:

1. PDF2Image with just the text, no images (by clearing the images)

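DPL.LoadFromFile("my_file.pdf", ""); // open the PDF; the second argument is the password (none here)
int image_count = DPL.FindImages(); // number of embedded images
// Loop bounds are as originally posted; check the reference for how image IDs are numbered.
for(int i=0; i<=image_count; i++)
{
    DPL.ClearImage(i); // clear the image
}
DPL.RenderPageToFile(72, 1, 0, "just_text.bmp"); // render page 1 at 72 DPI to an image file, now without the images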

Here is the list of functions: http://www.debenu.com/docs/pdf_library_reference/ImageHandling.php

2. PDF2Image with just the text, no images (by redrawing the text on a blank page)

DPL.LoadFromFile("my_file.pdf", ""); // open the PDF; the second argument is the password (none here)
var csv = DPL.GetPageText(3); // returns a CSV string with the coordinates of the text

// create a new blank file
// XPos is the horizontal position of the text - get it from the CSV string
// YPos is the vertical position of the text - get it from the CSV string
// your_text is the text to draw - get it from the CSV string
DPL.DrawText(XPos, YPos, your_text); // repeat for every piece of text in the CSV string
DPL.RenderPageToFile(72, 1, 0, "just_text.bmp"); // render page 1 at 72 DPI to an image file; it contains only the redrawn text
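The "get it from the CSV string" step could be sketched like this in plain Java. Note that the field layout is an assumption and not something stated in this answer: the sketch assumes each line looks like font name, colour, size, then four x/y corner pairs starting at the bottom-left corner, then the text, with double quotes around fields that contain commas. Check the library reference for the real layout before relying on the indices. `PageTextCsv`, `TextItem`, `splitCsvLine` and `parseLine` are hypothetical names for this illustration and are not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: pulls position and text out of one CSV line. */
final class PageTextCsv {

    /** One piece of text with its assumed bottom-left corner. */
    static final class TextItem {
        final double x;
        final double y;
        final String text;

        TextItem(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /** Splits one CSV line into fields, honoring double quotes and "" escapes. */
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quoted) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // "" inside quotes is a literal quote
                    i++;
                } else if (c == '"') {
                    quoted = false;      // closing quote
                } else {
                    current.append(c);   // commas are kept while quoted
                }
            } else if (c == '"') {
                quoted = true;           // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** ASSUMED layout: font, colour, size, x1, y1, x2, y2, x3, y3, x4, y4, text. */
    static TextItem parseLine(String line) {
        List<String> f = splitCsvLine(line);
        if (f.size() < 12) {
            throw new IllegalArgumentException("unexpected field count: " + f.size());
        }
        double x = Double.parseDouble(f.get(3).trim()); // assumed x1
        double y = Double.parseDouble(f.get(4).trim()); // assumed y1
        return new TextItem(x, y, f.get(f.size() - 1)); // text assumed to be last
    }
}
```

Each parsed item would then supply the XPos, YPos and your_text values for one DPL.DrawText call.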


2014-10-27 15:40:23 zacharpali
