2011-07-21 45 views

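Python pyPdf: problem writing the output file, ValueError: I/O operation on closed file

This function (part of a class that scrapes a website into a PDF) is supposed to merge the PDF files generated from the web pages using pyPdf.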

Here is the code for the method:

def mergePdf(self, mainname, inputlist=0):
    """merging the pdf pages
    getting an inputlist to merge or defaults to the class instance self.pdftomerge list"""
    from pyPdf import PdfFileWriter, PdfFileReader
    self._mergelist = inputlist or self.pdftomerge
    self.pdfoutput = PdfFileWriter()

    for name in self._mergelist:
        print "merging %s into main pdf file: %s" % (name, mainname)
        self._filestream = file(name, "rb")
        self.pdfinput = PdfFileReader(self._filestream)
        for p in self.pdfinput.pages:
            self.pdfoutput.addPage(p)
        self._filestream.close()

    self._pdfstream = file(mainname, "wb")
    self._pdfstream.open()
    self.pdfoutput.write(self._pdfstream)
    self._pdfstream.close()

I keep getting this error:

File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 264, in write 
    self._sweepIndirectReferences(externalReferenceMap, self._root) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 324, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, data[i]) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 345, in _sweepIndirectReferences 
    newobj = data.pdf.getObject(data) 
    File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 645, in getObject 
    self.stream.seek(start, 0) 
ValueError: I/O operation on closed file 

But when I check the state of self._pdfstream, I get:

<open file 'c:\python27\learn\dive.pdf', mode 'wb' at 0x013B2020> 

What am I doing wrong?

I'd be grateful for any help.

Answers


OK, I found your problem. You were right to call file(). Don't try to call open() on the result as well.

Your problem is that the input files still need to be open when you call self.pdfoutput.write(self._pdfstream), so you need to remove the line self._filestream.close() from the loop.
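To see why closing the input early breaks the later write, here is a minimal sketch that does not use pyPdf at all. LazyReader is a hypothetical stand-in, assuming (as the traceback suggests, where getObject calls self.stream.seek during write) that the reader only keeps a reference to its stream and reads from it when the data is finally needed. The temporary file is a placeholder for an input PDF.

```python
import os
import tempfile


class LazyReader(object):
    """Hypothetical stand-in for a reader that defers reading until asked."""

    def __init__(self, stream):
        self.stream = stream  # only a reference is kept; nothing is read yet

    def get_data(self):
        self.stream.seek(0, 0)  # raises ValueError if the stream is closed
        return self.stream.read()


def read_after_close(path):
    """Close the input before the deferred read; return the error text."""
    stream = open(path, "rb")
    reader = LazyReader(stream)
    stream.close()  # too early: the reader has not read anything yet
    try:
        reader.get_data()
    except ValueError as exc:
        return str(exc)
    return None


def read_before_close(path):
    """Do the deferred read first, then close the input."""
    stream = open(path, "rb")
    try:
        return LazyReader(stream).get_data()
    finally:
        stream.close()


# Throwaway file standing in for an input PDF.
handle, demo_path = tempfile.mkstemp()
os.write(handle, b"dummy page data")
os.close(handle)

early_close_error = read_after_close(demo_path)
data = read_before_close(demo_path)
os.remove(demo_path)

print(early_close_error)
print(data)
```

The first call reports a closed-file ValueError (the exact wording may differ between Python versions), while the second returns the file contents. The output stream is never involved, which is why inspecting self._pdfstream shows an open file.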

Edit: This script will trigger the problem. The first write succeeds; the second one fails.

from pyPdf import PdfFileReader as PfR, PdfFileWriter as PfW 

input_filename = 'in.PDF' # replace with a real file 
output_filename = 'out.PDF' # something that doesn't exist 

infile = file(input_filename, 'rb') 
reader = PfR(infile) 
writer = PfW() 

writer.addPage(reader.getPage(0)) 
outfile = file(output_filename, 'wb') 
writer.write(outfile) 
print "First Write Successful!" 
infile.close() 
outfile.close() 

infile = file(input_filename, 'rb') 
reader = PfR(infile) 
writer = PfW() 

writer.addPage(reader.getPage(0)) 
outfile = file(output_filename, 'wb') 
infile.close() # BAD! The writer still has to read from infile.

writer.write(outfile)
print "You'll get a ValueError before this line"
outfile.close() 
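Applied to the merge method from the question, one way to keep every input open until the output has been written is sketched below. merge_pdfs and its parameter names are illustrative, not part of pyPdf. The reader class and the writer object are passed in, on the assumption that they behave like pyPdf's PdfFileReader (a pages attribute) and PdfFileWriter (addPage and write), as used in the question.

```python
def merge_pdfs(input_names, output_name, reader_factory, writer):
    """Add every page of each input to writer, then write output_name.

    Input streams stay open until the write has finished, because the
    writer may still read page data from them at write time.
    """
    streams = []
    try:
        for name in input_names:
            stream = open(name, "rb")
            streams.append(stream)  # remember it so it can be closed later
            reader = reader_factory(stream)
            for page in reader.pages:
                writer.addPage(page)

        with open(output_name, "wb") as out:
            writer.write(out)
    finally:
        # Only now is it safe to close the inputs.
        for stream in streams:
            stream.close()


# Possible usage with pyPdf (file names are placeholders):
#   from pyPdf import PdfFileWriter, PdfFileReader
#   merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf", PdfFileReader, PdfFileWriter())
```

The try/finally block means the inputs are closed even if reading or writing raises, so nothing is left open after a failure.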

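Hey agf, as I wrote, my problem is with self._pdfstream. I changed the open() call, but that didn't help. I still get the error when I try to write from pyPdf, and when I check the object I still get <open file 'c:\python27\learn\dive.pdf', mode 'wb' at 0x013B2020>. WTF? – alonisser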


@alonisser You're right, calling 'open()' was wrong! But your problem isn't 'self._pdfstream'; it's with the input streams. I've edited my answer. – agf


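That seems to have solved the problem, thanks a lot! But now there's another one! I get the same long error string with a different ending: line 693, in readObjectHeader return int(idnum), int(generation) ValueError: invalid literal for int() with base 10: '' Any ideas? – alonisser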