Unable to download a PDF using Python

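I want to download a PDF with a Python script. I have tried urllib, pdfkit, and curl. When I try to download the PDF, I get the page's HTML/JS content instead of the PDF file. Please help me solve this problem.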

Using pdfkit:

import pdfkit 
pdfkit.from_url('http://www.kubota.com/product/BSeries/B2301/pdf/B01_Specs.pdf', 'out.pdf', options = {'javascript-delay':'10000'})

Using urllib2:

import urllib2 
response = urllib2.urlopen('http://www.kubota.com/product/BSeries/B2301/pdf/B01_Specs.pdf') 
file = open("out.pdf", 'wb') 
file.write(response.read()) 
file.close()


2017-04-24 Vamshi Kolanu

You can use the urllib3 library:

import urllib3 

def download_file(download_url): 
    http = urllib3.PoolManager() 
    response = http.request('GET', download_url) 
    f = open('output.pdf', 'wb') 
    f.write(response.data) 
    f.close() 

if __name__ == '__main__': 
    download_file('http://www.kubota.com/product/BSeries/B2301/pdf/B01_Specs.pdf')


2017-04-24 23:38:33 eyllanesc

Great... it works!

You should be able to do this quite easily with requests:

import requests 

r = requests.get('http://www.axmag.com/download/pdfurl-guide.pdf') #your url here 
with open('your_file_path_here.pdf', 'wb') as f: 
    f.write(r.content)


2017-04-24 23:50:28 slearner

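Actually, I just tried it with your link, and it looks like there is a CAPTCHA or some other form of authentication in front of the PDF you are after, so that may be the problem rather than your code. – slearner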

Thank you for the reply. How do I solve it? Do I need to send some information in the request headers to get through the authentication?
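
A quick way to confirm the diagnosis above is to check whether the downloaded bytes are really a PDF before saving them. A PDF file starts with the header `%PDF-`, while a CAPTCHA or login page starts with HTML. The sketch below uses the standard-library `urllib.request` (Python 3); the function names `is_pdf`, `save_if_pdf`, and `download_pdf` and the 30-second timeout are illustrative assumptions, not something from the thread. It only detects the problem; it does not get past any authentication.

```python
import urllib.request


def is_pdf(data: bytes) -> bool:
    """Return True if the payload begins with the PDF file header."""
    # PDF files start with "%PDF-"; an HTML challenge page does not.
    return data.startswith(b"%PDF-")


def save_if_pdf(data: bytes, path: str) -> bool:
    """Write data to path only when it looks like a PDF; report success."""
    if not is_pdf(data):
        return False
    with open(path, "wb") as f:
        f.write(data)
    return True


def download_pdf(url: str, path: str) -> bool:
    """Fetch url and save it to path, refusing to save non-PDF content."""
    # The timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    return save_if_pdf(data, path)
```

If `download_pdf(url, 'out.pdf')` returns `False`, the server sent something other than a PDF, most likely the HTML page described in the question. The same `is_pdf` check can be applied to `r.content` from requests or `response.data` from urllib3.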
