Unicode error with pyPdf
I am trying to download several PDFs with the requests library and merge them with pyPdf. This generally works fine, but for some PDFs I just get an error.
MWE.py
import requests
from pyPdf import PdfFileWriter, PdfFileReader
from StringIO import StringIO
# NOTE: `response` and `session` come from the requests download code, which is not shown here
input = PdfFileReader(StringIO(response.content))
input.decrypt("")
output = PdfFileWriter()
output.addPage(input.getPage(0))
outputStream = file("document-output.pdf", "wb")
output.write(outputStream)
outputStream.close()
session.close()
Error
Traceback (most recent call last):
File "mwe.py", line 21, in <module>
input.decrypt("")
File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 894, in decrypt
return self._decrypt(password)
File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 904, in _decrypt
user_password, key = self._authenticateUserPassword(password)
File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 945, in _authenticateUserPassword
encrypt.get("/EncryptMetadata", BooleanObject(False)).getObject())
File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1818, in _alg35
key = _alg32(password, rev, keylen, owner_entry, p_entry, id1_entry)
File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1729, in _alg32
m.update(id1_entry)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
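For the traceback I read the input from a file, but I don't think that matters in this case.
I found some related questions about this problem, but I could not solve my specific case.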
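Are you going to share the rest of the traceback?
The error occurs in the decrypt method, doesn't it? The PDF is actually not encrypted, but I found this workaround with an empty password. Without it, I get an 'Exception: file has not been decrypted' error inside the addPage method.
Why are you using 'file'? You should really use 'open'.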
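The last frame of the traceback shows `m.update(id1_entry)` failing with an implicit ASCII encode, which suggests that the document's ID entry reached the hash function as a text (unicode) string rather than as raw bytes. The following is a minimal sketch of that failure mode and one possible way around it, not pyPdf's actual code. The helper name `md5_of_id` is hypothetical, and the Latin-1 encoding rests on the assumption that every character of the ID corresponds to exactly one original byte (code point below 256); if the library decoded the ID some other way, the original bytes cannot be recovered like this.

```python
import hashlib


def md5_of_id(id_entry):
    """Return the MD5 hex digest of a PDF /ID entry (hypothetical helper).

    The entry may arrive as bytes or as a text string. Hash functions
    need bytes, so text is encoded explicitly instead of relying on an
    implicit ASCII conversion.
    """
    if not isinstance(id_entry, bytes):
        # ASSUMPTION: each character maps to one original byte, so
        # Latin-1 restores the byte values. Characters above U+00FF
        # would still raise UnicodeEncodeError here.
        id_entry = id_entry.encode("latin-1")
    return hashlib.md5(id_entry).hexdigest()


# A text ID whose first two characters are non-ASCII, as in the traceback.
text_id = u"\xfe\xffab"
raw_id = b"\xfe\xffab"

# Both forms now hash to the same digest.
same_digest = md5_of_id(text_id) == md5_of_id(raw_id)
```

Applying the same idea to the real failure would mean making sure the ID is converted to bytes before it is passed to `m.update` inside `_alg32`, either by patching the installed pyPdf or by switching to a maintained fork that handles this case. Whether that resolves the problem for a particular PDF depends on how the library decoded the ID in the first place.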