Including an image in an RDLC report causes unexpectedly large output size

I have an RDLC report that contains some data and, optionally, an image. The content is rendered as a PDF.

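I may have a container (package) file that stores 100 of these rendered results. The problem is that when I include the image, the output size grows by far more than expected.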

As an example: my RDLC report is an invoice that can show a signature image at the bottom. I may have 100 invoices in one customer's package file.

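If the total output package (100 invoices) without the image is 2 MB, and the image is 15 KB, then the package with images should come to about 3.5 MB (2 MB + 15 KB * 100). The problem is that the package I actually get is over 8 MB.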
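To make the gap concrete, the figures above can be worked through in a few lines. This is only a sketch of the arithmetic: it assumes decimal units (1 MB = 1000 KB) and treats the reported "over 8 MB" as a lower bound of exactly 8 MB; the variable names are illustrative.

```python
# Figures from the question, using decimal units (1 MB = 1000 KB).
KB_PER_MB = 1000.0

invoice_count = 100
base_package_mb = 2.0      # package size without images
image_kb = 15.0            # size of the signature image file
observed_package_mb = 8.0  # reported as "over 8 MB", so this is a lower bound

# What the package should weigh if each image cost exactly its file size.
expected_package_mb = base_package_mb + invoice_count * image_kb / KB_PER_MB

# What each embedded image actually costs, judging by the observed package.
observed_per_image_kb = (observed_package_mb - base_package_mb) * KB_PER_MB / invoice_count

print("expected package size: %.1f MB" % expected_package_mb)
print("observed cost per image: at least %.0f KB" % observed_per_image_kb)
print("inflation factor: at least %.1fx" % (observed_per_image_kb / image_kb))
```

In other words, each embedded image appears to cost at least four times its size on disk.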

Is there any technique that can reduce this output size, or any other way to get an output size closer to what I expect?

2012-03-02 StingyJack

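I don't know what RDLC is, but I'd think a 15 KB image doesn't necessarily stay 15 KB once it is rendered into a PDF. Typical images made for the web have a resolution of 72 dpi. When one is included in a PDF, the software often converts it to 200-300 dpi for best print quality. A 100x100 pixel image thus becomes a ~278x278 px image at 200 dpi; a 10,000 px image is converted to one of about 77,000 px, you do the math. – 2012-03-05 14:39:32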
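The resampling arithmetic in this comment can be checked with a short sketch. The function name `upsampled_size` and its defaults are illustrative; whether the RDLC PDF renderer actually resamples images this way is the commenter's hypothesis, not something confirmed in the thread.

```python
def upsampled_size(width_px, height_px, source_dpi=72, target_dpi=200):
    """Return pixel dimensions after resampling to target_dpi at the same physical size."""
    scale = float(target_dpi) / source_dpi
    return int(round(width_px * scale)), int(round(height_px * scale))


# The 100x100 px example from the comment, going from 72 dpi to 200 dpi.
w, h = upsampled_size(100, 100)
print("%dx%d px, %d pixels total" % (w, h, w * h))
print("pixel count grows by a factor of %.1f" % (float(w * h) / (100 * 100)))
```

Because the pixel count scales with the square of the dpi ratio, even a modest dpi change multiplies the amount of image data several times over.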

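Since no new information is added, it would be silly for the PDF renderer to store the upsampled image. Upsampling can wait until print time. But plenty of software does silly things... – japreiss 2012-03-05 14:46:22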

Can you tell me the type of your image (JPG, PNG, TIF), its color depth (1bpp, 8bpp, 24bpp, etc.) and its size (width and height in pixels)? – iPDFdev 2012-03-05 15:17:21

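Depending on what your PDF generator can do, images may be saved with weak compression, with lossless compression, or with no compression at all. You can use the method below to extract image information from the PDF and check whether this is what is happening in your case. If it is, some "PDF compression" software can be used to fix the problem.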
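To illustrate how much the choice of compression matters, here is a minimal sketch using only Python's standard `zlib` module. It builds synthetic grayscale pixel data (the 300x100 dimensions and the single diagonal stroke are made up for illustration, not taken from the question) and compares the raw byte count with the deflate-compressed one; deflate is the algorithm behind PDF's FlateDecode filter.

```python
import zlib

WIDTH, HEIGHT = 300, 100   # hypothetical signature dimensions, not from the question

# Build 8-bit grayscale pixel data: white background with one dark diagonal stroke.
pixels = bytearray([255]) * (WIDTH * HEIGHT)
for x in range(WIDTH):
    y = x * HEIGHT // WIDTH
    pixels[y * WIDTH + x] = 0

raw = bytes(pixels)                # what an unfiltered image stream would hold
deflated = zlib.compress(raw, 9)   # deflate, the algorithm used by FlateDecode

print("uncompressed: %d bytes" % len(raw))
print("deflate:      %d bytes" % len(deflated))
```

A real signature scan will compress less well than this synthetic example, but the comparison shows why an image stored without a filter can dwarf the original file.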

(This may seem odd, but I really could not find any ready-made software that does this.)

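Install Python 2.x and the PDFMiner package (see the PDFMiner manual#cmap for installation steps), then use the following code to list all images in the document along with their sizes and compression. For a list and description of the compression algorithms PDF uses, see the PDF specification, page 23 (the "Standard filters" table).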

from pdfminer.pdfparser import PDFParser, PDFDocument
# PDFTextExtractionNotAllowed is assumed to be importable from pdfminer.pdfinterp
# in the PDFMiner release this script targets.
from pdfminer.pdfinterp import (PDFResourceManager, PDFPageInterpreter,
                                PDFTextExtractionNotAllowed)
from pdfminer.layout import LAParams, LTImage
from pdfminer.converter import PDFPageAggregator

# Open a PDF file.
fp = open('Reader.pdf', 'rb')
# Create a PDF parser object associated with the file object.
parser = PDFParser(fp)
# Create a PDF document object that stores the document structure.
doc = PDFDocument()
# Connect the parser and document objects.
parser.set_document(doc)
doc.set_parser(parser)
# Supply the password for initialization.
# (If no password is set, give an empty string.)
doc.initialize('')
# Check if the document allows text extraction. If not, abort.
if not doc.is_extractable:
    raise PDFTextExtractionNotAllowed
# Create a PDF resource manager object that stores shared resources.
rsrcmgr = PDFResourceManager()

# Set parameters for analysis.
laparams = LAParams()
# Create a PDF page aggregator object.
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
interpreter = PDFPageInterpreter(rsrcmgr, device)

# Build the layout trees of all pages.
layouts = []
for page in doc.get_pages():
    interpreter.process_page(page)
    # Receive the LTPage object for the page.
    layouts.append(device.get_result())

# Search the trees for images and show their info,
# excluding repeated ones.
known_ids = set()
count = 0
size = 0

def lsimages(obj):
    global count, size
    # Only container objects have children to descend into.
    if hasattr(obj, '_objs'):
        for child in obj._objs:
            if isinstance(child, LTImage):
                attrs = child.stream.attrs
                # Identify the image through its 'ID' entry; this assumes
                # the image's stream dictionary carries one.
                image_id = attrs['ID'].objid
                if image_id not in known_ids:
                    print attrs
                    count += 1
                    size += attrs.get('Length', 0)
                    known_ids.add(image_id)
            lsimages(child)

for layout in layouts:
    lsimages(layout)
print "Total: %d images, %d bytes" % (count, size)

Credits: the boilerplate code is taken from the Programming with PDFMiner article.
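Once the script has printed each image's attributes, the `Filter` entry tells you how that image is stored. The helper below is a rough sketch for reading that entry against the standard filter names in the PDF specification. The names `FILTER_NOTES` and `describe_filters` are made up for this example, and it assumes the filter has already been converted to a plain string (or list of strings); PDFMiner itself reports filter names as literal objects.

```python
# Standard PDF stream filters and what they mean for image data.
FILTER_NOTES = {
    "DCTDecode": "JPEG (lossy)",
    "JPXDecode": "JPEG 2000 (lossy or lossless)",
    "FlateDecode": "zlib/deflate (lossless)",
    "LZWDecode": "LZW (lossless)",
    "RunLengthDecode": "run-length encoding (lossless)",
    "CCITTFaxDecode": "CCITT fax, 1-bit images (lossless)",
    "JBIG2Decode": "JBIG2, 1-bit images (lossy or lossless)",
    "ASCIIHexDecode": "ASCII hex encoding (expands data, no compression)",
    "ASCII85Decode": "ASCII base-85 encoding (expands data, no compression)",
}


def describe_filters(filter_value):
    """Describe a stream's Filter entry: None, one name, or a list of names."""
    if filter_value is None:
        return ["no filter: stream stored uncompressed"]
    if isinstance(filter_value, (list, tuple)):
        names = filter_value
    else:
        names = [filter_value]
    # Accept names with or without the leading slash used in PDF syntax.
    return [FILTER_NOTES.get(str(name).lstrip("/"), "unknown filter") for name in names]


print(describe_filters(None))
print(describe_filters("/DCTDecode"))
print(describe_filters(["ASCII85Decode", "FlateDecode"]))
```

If the signature images turn out to have no filter, or only an ASCII encoding filter, that would be consistent with the inflated package size described in the question.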


2012-03-07 04:27:48
