2011-08-10 195 views
1

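Path problem (command line)

I have written a script that builds an array of file names. It searches a directory and its subdirectories recursively for PDF files and adds them to the array. It then assembles a command string and passes it to pdftk on the command line to merge the files.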

pdftk accepts arguments like this:

pdftk inputpdf1.pdf inputpdf2.pdf cat output output.pdf 

However, judging by the error message I get in the cmd window (shown below), the paths being passed in are not correct. I get the same error on Ubuntu.

Microsoft Windows XP [Version 5.1.2600] 
(C) Copyright 1985-2001 Microsoft Corp. 

C:\Documents and Settings\student3>cd C:\Documents and Settings\student3\Desktop 
\Test 

C:\Documents and Settings\student3\Desktop\Test>pdftest.py 
Merging C:\Documents and Settings\student3\Desktop\Test\1.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\1.pdf" cat outputC:\Docum 
ents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\1.pdf" cat outputC:\Docum 
ents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Merging C:\Documents and Settings\student3\Desktop\Test\2.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\2.pdf" cat outputC:\Docum 
ents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\2.pdf" cat outputC:\Docum 
ents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Merging C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf" cat outputC: 
\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf" cat outputC: 
\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Merging C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf" cat outputC: 
\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf" cat outputC: 
\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Merging C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf" cat output 
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf" cat output 
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Merging C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf 
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf" cat output 
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf" cat output 
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf 
Error: Unexpected text in page reference, here: 
    outputC:\Documents 
    Exiting. 
    Acceptable keywords, here, are: "even", "odd", or "end". 
Errors encountered. No output created. 
Done. Input errors, so no output created. 
Finished Processing 

C:\Documents and Settings\student3\Desktop\Test> 

Here is the script:

#---------------------------------------------------------------------------------------------- 
# Name:  pdfMerger 
# Purpose:  Automatic merging of all PDF files in a directory and its sub-directories and 
#    rename them according to the folder itself. Requires the pyPDF Module 
# 
# Current:  Processes all the PDF files in the current directory 
# To-Do:  Process the sub-directories. 
# 
# Version: 1.0 
# Author:  Brian Livori 
# 
# Created:  03/08/2011 
# Copyright: (c) Brian Livori 2011 
# Licence:  Open-Source 
#--------------------------------------------------------------------------------------------- 
#!/usr/bin/env python 

import os 
import glob 
import sys 
import fnmatch 
import subprocess 

path = str(os.getcwd()) 


x = 0 

def process_file(_, path, filelist): 
    os.path.walk(os.path.realpath(topdir), process_file,()) 
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf")) 

    output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"' 

    cmd = "pdftk " + input_param + " cat output " + output_param 
    os.system(cmd) 


    for filenames in os.walk (path): 
     if "Output" in filenames: 
      filenames.remove ("Output") 

    if os.path.exists(final_output) != True: 

        os.mkdir(final_output) 
        sp = subprocess.Popen(cmd) 
        sp.wait() 


    else: 

        sp = subprocess.Popen(cmd) 
        sp.wait() 




def files_recursively(topdir): 
os.path.walk(os.path.realpath(topdir), process_file,()) 

files_recursively(path) 

print "Finished Processing" 

What exactly am I doing wrong?

File "C:\Documents and Settings\student3\Desktop\Test\pdftest2.py", line 32 
    output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"' 
      ^
SyntaxError: invalid syntax 
+0

Sorry, I meant PDFTK, not PDFBOX. – Brian

+0

Added the complete script. – Jacob

Answers

3

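Because the paths contain spaces, you need to escape them by wrapping each one in double quotes. Otherwise your shell treats every space as the separator before a new file name.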

" ".join('"' + str(f) + '"' for f in filesArr) 

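A few more things:

  1. You call pdftk once for every PDF. Move the call out of the loop and build a list of input files first (assuming you want to merge all the input PDFs into a single output PDF).
  2. You are missing a space after "cat output":

    ... " cat output " + outputpath + ext)

  3. The outputpath variable is empty.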
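
To make the three points above concrete, here is a minimal sketch of building a single command string. The helper names `quote` and `build_merge_command` and the example paths are hypothetical, not taken from the question's script, and the simple quoting shown here assumes the paths themselves contain no double quotes.

```python
def quote(path):
    """Wrap a path in double quotes so the shell treats it as one argument."""
    return '"' + str(path) + '"'


def build_merge_command(input_paths, output_path):
    """Build one pdftk command string that merges all inputs into output_path."""
    inputs = " ".join(quote(p) for p in input_paths)
    # The spaces around "cat output" matter: without the trailing one, the
    # keyword and the output path fuse into a single token that pdftk rejects.
    return "pdftk " + inputs + " cat output " + quote(output_path)


files = [r"C:\Test Dir\1.pdf", r"C:\Test Dir\2.pdf"]  # hypothetical paths
cmd = build_merge_command(files, r"C:\Test Dir\Output\merged.pdf")
print(cmd)
```

Printing the command before running it is the quickest way to check that every path is quoted and that the output file name is not empty.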

Edit:

Your code is a bit messy. I changed the process_file function to:

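def process_file(_, path, filelist): 
    # Quote every PDF in this directory and join them into one argument string
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf"))) 
    # Raw string so the backslashes in the Windows path are kept literally
    output_param = r'"C:\ENTER\OUTPUT\PATH\HERE.PDF"' 
    cmd = "pdftk " + input_param + " cat output " + output_param 
    os.system(cmd) 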

I don't really understand why you need all those assignments.

Edit 2:

Here is my complete script:

#!/usr/bin/env python 

import os 
import glob 

def process_file(_, path, filelist): 
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf"))) 
    output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"' 
    cmd = "pdftk " + input_param + " cat output " + output_param 
    print cmd 
    os.system(cmd) 

def files_recursively(topdir): 
    os.path.walk(os.path.realpath(topdir), process_file,()) 

if __name__ == "__main__": 
    files_recursively(os.getcwd()) 

Here are the commands it produces:

pdftk "/home/user/pdf/Test1.pdf" "/home/user/pdf/Test3.pdf" "/home/user/pdf/Test2.pdf" cat output "/home/user/pdf/pdf.pdf" 
pdftk "/home/user/pdf/Sub3/Test1.pdf" "/home/user/pdf/Sub3/Test3.pdf" "/home/user/pdf/Sub3/Test2.pdf" cat output "/home/user/pdf/Sub3/Sub3.pdf" 
pdftk "/home/user/pdf/Sub2/Test1.pdf" "/home/user/pdf/Sub2/Test3.pdf" "/home/user/pdf/Sub2/Test2.pdf" cat output "/home/user/pdf/Sub2/Sub2.pdf" 
pdftk "/home/user/pdf/Sub2/SubSub21/Test1.pdf" "/home/user/pdf/Sub2/SubSub21/Test3.pdf" "/home/user/pdf/Sub2/SubSub21/Test2.pdf" cat output "/home/user/pdf/Sub2/SubSub21/SubSub21.pdf" 
pdftk "/home/user/pdf/Sub2/SubSub22/Test1.pdf" "/home/user/pdf/Sub2/SubSub22/Test3.pdf" "/home/user/pdf/Sub2/SubSub22/Test2.pdf" cat output "/home/user/pdf/Sub2/SubSub22/SubSub22.pdf" 
pdftk "/home/user/pdf/Sub1/Test1.pdf" "/home/user/pdf/Sub1/Test3.pdf" "/home/user/pdf/Sub1/Test2.pdf" cat output "/home/user/pdf/Sub1/Sub1.pdf" 
pdftk "/home/user/pdf/Sub1/SubSub2/Test1.pdf" "/home/user/pdf/Sub1/SubSub2/Test3.pdf" "/home/user/pdf/Sub1/SubSub2/Test2.pdf" cat output "/home/user/pdf/Sub1/SubSub2/SubSub2.pdf" 
pdftk "/home/user/pdf/Sub1/SubSub1/Test1.pdf" "/home/user/pdf/Sub1/SubSub1/Test3.pdf" "/home/user/pdf/Sub1/SubSub1/Test2.pdf" cat output "/home/user/pdf/Sub1/SubSub1/SubSub1.pdf" 
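
The script above relies on `os.path.walk` and the `print` statement, both of which exist only in Python 2. For readers on Python 3, the following is a rough equivalent built on `os.walk`. It is a sketch under assumptions rather than a drop-in replacement: the names `merge_commands` and `merge_all` are made up for illustration, the inputs are sorted, and a previous run's output file is skipped, none of which the original script does.

```python
import os
import subprocess


def merge_commands(topdir):
    """Yield one pdftk argument list per directory that contains PDFs."""
    for dirpath, dirnames, filenames in os.walk(os.path.realpath(topdir)):
        dirnames.sort()  # make the traversal order deterministic
        output_name = os.path.basename(dirpath) + ".pdf"
        pdfs = sorted(
            os.path.join(dirpath, name)
            for name in filenames
            # skip an earlier run's output so it is not merged into itself
            if name.lower().endswith(".pdf") and name != output_name
        )
        if pdfs:
            output = os.path.join(dirpath, output_name)
            yield ["pdftk"] + pdfs + ["cat", "output", output]


def merge_all(topdir):
    """Run each command; assumes pdftk is installed and on PATH."""
    for cmd in merge_commands(topdir):
        print(cmd)
        subprocess.call(cmd)
```

Calling `merge_all(os.getcwd())` would mirror the `__main__` block of the script above. Passing the arguments as a list means no manual quoting is needed for paths with spaces.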
+0

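I'm still getting the same error. I think it has something to do with os.getcwd(). – Brian

+0

Maybe you didn't escape the output path? Print your `cmd` variable to the shell before executing it, and edit the result into your question. – Jacob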

+0

How do I do that? I mean, how do I output the cmd variable? – Brian

0

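Instead of os.system() you should use subprocess.Popen. That module takes care of spaces in file names if you pass the command and its arguments as a list.

On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string using the list2cmdline method. Please note that not all MS Windows applications interpret the command line the same way: list2cmdline is designed for applications using the same rules as the MS C runtime.

In your example, that would be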

cmd = ["pdftk"] + files_arr + ["cat", "output", outputpath + ext] 

and then

sp = subprocess.Popen(cmd) 
sp.wait() 
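
As a small illustration of the list form, the sketch below builds the argument list and uses `subprocess.list2cmdline` (the helper named in the documentation quote above) to preview the string that would be handed to CreateProcess() on Windows. The file paths are hypothetical, and the actual pdftk call is left commented out because it assumes pdftk is installed and on PATH.

```python
import subprocess

files_arr = [r"C:\Test Dir\1.pdf", r"C:\Test Dir\2.pdf"]  # hypothetical inputs
output = r"C:\Test Dir\Output\merged.pdf"                 # hypothetical output

# Each list element is passed as exactly one argument; no manual quoting.
cmd = ["pdftk"] + files_arr + ["cat", "output", output]

# Preview the command line that Popen would build from the list on Windows.
preview = subprocess.list2cmdline(cmd)
print(preview)

# To actually run the merge (requires pdftk on PATH):
# sp = subprocess.Popen(cmd)
# sp.wait()
```

Arguments that contain spaces come out wrapped in double quotes in the preview, while `cat` and `output` stay as separate, unquoted tokens.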
+0

I tried that, but I still get the same error. Should I change any of the Python code? – Brian