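Bash PDF merge misses files
I want to merge a large number of PDF files in chunks of about 3000 files each. After many attempts, this script seemed to do the trick (I was wrong, of course):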
#!/bin/bash
basepath='/home/lemonidas/pdfstuff';
datename=`date "+%Y%m%d%H%M.%S"`;
start=`date "+%s"`;
echo "parsing pdf list to file..."
find $basepath/input/ -name "*.pdf" | xargs -I {} ls {} >> $basepath/tmp/biglist$datename.txt
split -l 3000 $basepath/tmp/biglist$datename.txt $basepath/tmp/splitfile
echo "deleting big file..."
rm $basepath/tmp/biglist$datename.txt
echo "done splitting!"
declare -i x
x=1
for f in $basepath/tmp/splitfile*
do
linenum=`cat $f | wc -l`;
echo "Processing $f ($linenum lines)..."
# merge to one big PDF
cat $f | xargs gs -q -sstdout=$basepath/error.log -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=$basepath/output/$x.big.pdf 2>$basepath/error.log
echo "Completed PDF $x"
((x++))
# delete the list file
rm $f
echo "Deleted processed file $f"
done
end=`date "+%s"`;
echo "Started: $start"
echo "Finished: $end"
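The problem is that I have 22,000 two-page PDF files, so each output file (except the last) should have 6000 pages (each merge list holds 3000 PDF files, as "wc -l" confirms before processing), yet I only get about 658 pages. No errors are reported, except this output from GS: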
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
Warning: Embedded symbolic TT fonts must contain a cmap for Platform=1 Encoding=0.
This file had errors that were repaired or ignored.
The file was produced by: >>>> Powered By Crystal Please notify the author of the software that produced this file that it does not conform to Adobe's published PDF specification.
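over and over (but not 22,000 times).
When I try it with 300-400 files it runs smoothly, but when I attempt the full run, after 2.5 hours I end up with fewer than half of the files merged.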
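One possible explanation for the gap between the small and the full runs, offered as an assumption since nothing in the log confirms it: xargs caps the length of each command line it builds, and when a list exceeds that cap it runs the command more than once. Every gs run in the script writes to the same -sOutputFile, so only the pages from the last run would survive. The sketch below counts how many times xargs would start a command for a given list; count_invocations is a hypothetical helper name introduced here.

```shell
# Count how many separate command invocations xargs makes for a list file.
# $1: text file with one argument (here: one PDF path) per line.
count_invocations() {
    # Each invocation of the inner shell prints exactly one line ("run"),
    # no matter how many arguments xargs appended to it, so counting the
    # lines counts the invocations.
    xargs sh -c 'echo run' _ < "$1" | wc -l | tr -d ' '
}
```

Calling it on one of the split lists before the merge loop, for example `count_invocations "$basepath/tmp/splitfileaa"`, would show whether that list fits into a single command line. A result above 1 would mean gs is started several times for that list.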
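My next idea is to convert each two-page PDF to .pgm files, but I don't know how to turn them back into a PDF (to avoid the font embedding problems). Am I missing something? (Probably.)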
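If the missing pages come from the merge command being started more than once per list (an assumption; the log does not confirm it), one way to rule that out is to hand every file of a list to a single invocation, so that an over-long argument list fails with a visible error instead of being split silently. A minimal sketch using the pdftk `cat output` syntax follows; merge_list and the MERGE_TOOL override are hypothetical names introduced here for illustration.

```shell
# Merge every PDF named in a list file with exactly one merger invocation.
# $1: list file with one PDF path per line, $2: output PDF.
# MERGE_TOOL is a hypothetical override (default: pdftk) so the function
# can be tried with a stand-in command.
merge_list() {
    ml_list=$1
    ml_out=$2
    # Rebuild the positional parameters from the list, one path per line.
    # Unlike a plain xargs, this keeps spaces inside file names intact.
    set --
    while IFS= read -r ml_line; do
        set -- "$@" "$ml_line"
    done < "$ml_list"
    if [ "$#" -eq 0 ]; then
        echo "no input files in $ml_list" >&2
        return 1
    fi
    # One invocation only: if the argument list is too long for the system,
    # the shell reports an error and returns a non-zero status.
    "${MERGE_TOOL:-pdftk}" "$@" cat output "$ml_out"
}
```

In the loop from the script this could be called as `merge_list "$f" "$basepath/output/$x.big.pdf" || echo "merge of $f failed"`, which makes a failed chunk show up in the output.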
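+1 for the pdftk suggestion – mouviciel
I'm using GS 8.61. I would have hoped that if the process choked, it would at least report an error. I'll try pdftk and report back. Thanks! – lemonidas
8.61 is quite old (almost 5 years now); the current version is 9.06 – KenS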