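BeautifulSoup: download all .zip files from Google Patent Search
What I want to do is use BeautifulSoup to download every zip file from the Google patent archive. Below is the code I have written so far, but I am having trouble getting the files to download into a directory on my desktop. Any help would be greatly appreciated.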
from bs4 import BeautifulSoup
import urllib2
import re
import pandas as pd

url = 'http://www.google.com/googlebooks/uspto-patents-grants.html'
site = urllib2.urlopen(url)
html = site.read()
soup = BeautifulSoup(html)
soup.prettify()

path = open('/Users/username/Desktop/', "wb")

for name in soup.findAll('a', href=True):
    print name['href']
    linkpath = name['href']
    rq = urllib2.request(linkpath)
    res = urllib2.urlopen(rq)
    path.write(res.read())
The result I expect is that all of the zip files are downloaded to a specific directory. Instead, I get the following error:
> #2015
> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
> <ipython-input-13-874f34e07473> in <module>()
>      17     print name['href']
>      18     linkpath = name['href']
> ---> 19     rq = urllib2.request(namep)
>      20     res = urllib2.urlopen(rq)
>      21     path.write(res.read())
>
> AttributeError: 'module' object has no attribute 'request'
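The AttributeError in the traceback points at the name itself: `urllib2` provides a class called `Request` (capital R), and there is no lowercase `request` attribute on the module. Separately, `open('/Users/username/Desktop/', "wb")` names a directory rather than a file, so each archive needs its own output path. The sketch below shows one possible way to handle the download step. It rests on several assumptions: it targets Python 3, where `urllib2.Request` and `urllib2.urlopen` are available as `urllib.request.Request` and `urllib.request.urlopen`; the function names `zip_urls`, `target_path`, and `download_all` are hypothetical; and it assumes the archive links end in `.zip`.

```python
import os
import shutil
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


def zip_urls(hrefs, base_url):
    """Return absolute URLs for the hrefs whose path ends in .zip."""
    urls = []
    for href in hrefs:
        # Resolve relative links against the page URL. A fragment-only
        # href such as "#2015" resolves to the page itself and is skipped.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).path.lower().endswith(".zip"):
            urls.append(absolute)
    return urls


def target_path(url, dest_dir):
    """Build one output file path per archive inside dest_dir."""
    return os.path.join(dest_dir, os.path.basename(urlparse(url).path))


def download_all(urls, dest_dir):
    """Download every URL into its own file under dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        request = Request(url)  # capital R (urllib2.Request in Python 2)
        with urlopen(request) as response, \
                open(target_path(url, dest_dir), "wb") as out:
            # Stream to disk rather than holding a whole archive in memory.
            shutil.copyfileobj(response, out)
```

With the `soup` object from the question, the input list could be built as `[a['href'] for a in soup.find_all('a', href=True)]` and passed to `zip_urls` together with `url`; the resulting list then goes to `download_all` along with the destination directory.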
What trouble are you running into? What is the expected result, and what happens instead?
It should download all the zip files, but I get this error: AttributeError: 'module' object has no attribute 'request', raised at line 19 (rq = urllib2.request(namep)). – icomefromchaos