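There is a very obscure way to extract the data, but it only works with old versions of Ghostscript, such as 8.51 or 8.62. In those older versions the PDF operators are defined in /lib/pdf_ops.ps. Newer versions handle this differently.
A build of version 8.62 for testing is available here: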
http://sourceforge.net/projects/ghostscript/files/GPL%20Ghostscript/8.62/gs862w32.exe/download
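The text you are after is drawn by the operators defined with /Tj {} def and /TJ {} def. Adding dup == at the start of each definition prints every string before it is shown. (This could be made more sophisticated.) I also did not worry about the font warning messages, but these can be filtered out if the data is written to a file.
Because kerning is applied, some words are split into separate letters. Given time, this could be filtered as well.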
Modified /Tj from pdf_ops.ps:
/Tj { dup == 0 0 moveto show settextposition } bdef
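Modified /TJ from pdf_ops.ps:
/TJ { dup ==                  % print the whole array before showing it
0 0 moveto {
dup type /stringtype eq {
show                          % string element: draw it
} { -1000 div                 % number element: kerning adjustment
currentfont /ScaleMatrix .knownget { 0 get mul } if
0 Vexch rmoveto
} ifelse
} forall settextposition
} bdef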
Output:
(Help a neighbor within your county each month by contributing to The Salvation)
(Army's Project SHARE and Georgia Power will match your gift. To help, simply check)
($1, $2, $5, or $10 on the return portion of this bill. Starting next month, your pledge)
(amount will be included on your monthly bill.)
(Our business offices will be closed on December 24 and 25 for Christmas and January)
(1 for New Year's Day. In case of an emergency, please call us at the number on your)
(bill 24 hours a day, 7 days a week.)
(PLEASE KEEP THIS PORTION FOR YOUR RECORDS.)
(PLEASE RETURN THIS PORTION WITH YOUR PAYMENT, MAKING SURE THE RETURN ADDRESS SHOWS IN THE ENVELOPE WINDOW.)
(Account Number)
(Mail To:)
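
The filtering mentioned earlier (dropping the font warnings and rejoining words that kerning split apart) could be done outside Ghostscript on the captured output. Below is a minimal Python sketch, not part of the original approach. It assumes each line printed by == is either a PostScript string such as (Mail To:) or a TJ array such as [(H) 20 (elp)], and that anything else is a warning to skip. The function names and the space_threshold value of 200 are hypothetical choices for illustration.

```python
import re

# Matches integers and reals as they may appear inside a TJ array.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
           "(": "(", ")": ")", "\\": "\\"}


def parse_ps_string(text, start):
    """Parse a PostScript literal string beginning at text[start] == '('.

    Returns (decoded_string, index_after_closing_paren).
    """
    out, depth, i = [], 1, start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            octal = re.match(r"[0-7]{1,3}", text[i + 1:i + 4])
            if octal:
                out.append(chr(int(octal.group(), 8) & 0xFF))
                i += 1 + len(octal.group())
            else:
                out.append(ESCAPES.get(text[i + 1], text[i + 1]))
                i += 2
            continue
        if ch == "(":
            depth += 1          # balanced nested parenthesis
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return "".join(out), i + 1
        out.append(ch)
        i += 1
    raise ValueError("unterminated PostScript string")


def decode_line(line, space_threshold=200):
    """Return the text of one == output line, or None for other lines.

    space_threshold is an assumed cut-off: a TJ adjustment of
    -space_threshold or lower is treated as a word gap.
    """
    line = line.strip()
    try:
        if line.startswith("("):
            text, end = parse_ps_string(line, 0)
            return text if end == len(line) else None
        if not (line.startswith("[") and line.endswith("]")):
            return None         # e.g. a font warning
        parts, i, end = [], 1, len(line) - 1
        while i < end:
            if line[i] == "(":
                text, i = parse_ps_string(line, i)
                parts.append(text)
                continue
            number = NUMBER.match(line, i)
            if number:
                # Negative adjustments move the next glyph to the right.
                if -float(number.group()) >= space_threshold:
                    if parts and not parts[-1].endswith(" "):
                        parts.append(" ")
                i = number.end()
            else:
                i += 1
        return "".join(parts)
    except ValueError:
        return None


def extract_text(lines):
    """Decode every recognizable line and drop the rest."""
    decoded = (decode_line(line) for line in lines)
    return [text for text in decoded if text is not None]
```

For example, extract_text(open("out.txt")) would return the decoded strings from a file holding the captured output, where out.txt is a placeholder name. The threshold probably needs tuning per document, since fonts differ in how wide a word gap is.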
Isn't PostScript fun?
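Hi, have you tried removing the images from a PDF so that it contains only text? I am looking for a way to do this. Do you have a solution using Ghostscript or another CLI tool? Please help. – codin 2013-12-19 09:55:21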