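I'm having trouble downloading multiple files in a loop using Python mechanize. I'm also using Beautiful Soup 4. The documentation for neither package seems to have an answer.
Here is my code; feel free to skip to the actual loop. I've included everything for reference: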
import mechanize, cookielib, os, time
from bs4 import BeautifulSoup

fcList = ['abandoned mine land inventory points', 'abandoned mine land inventory polygons', \
          'abandoned mine land inventory sites', 'coal mining operations', 'coal pillar location-mining', \
          'industrial mineral mining operations', 'longwall mining panels', 'mine drainage treatment/land recycling project locations', \
          'mined out areas', 'residual waste operations', 'underground mining permit']
dlLink = 'FTP Download'
dloadPath = 'C:\\Users\\SomeGuy\\Downloads'

# Browser
br = mechanize.Browser()

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Select the first (index zero) form
br.select_form(nr=0)

# Input form data
br.form['Keyword']='mining'
br.submit()
html = br.response().read()

# Pass html to beautiful soup for parse
soup = BeautifulSoup(html)
htmlinks = soup.findAll("a")

# Find links with desired text
for htmlink in htmlinks:
    string = str(htmlink.string)
    if string.lower() in fcList:
        print "Matched link!", string + ". attempting download...\n"
        try:
            req = br.click_link(text = string)
            br.open(req)
            print "URL: " + str(br.geturl)
            html = br.response().read()
            soup = BeautifulSoup(html)
            the_tag = soup.find('a', text=dlLink)
            fileURL = the_tag.get('href')
            print fileURL
            # attempt download
            fnam = string.replace(" ", "_")
            fnam = fnam.replace("/", "_")
            f = br.retrieve(fileURL, os.path.join(dloadPath, fnam + ".zip"))
            print f + "\n"
            br.back()
        except:
            print "An unknown error occurred."
Output:
>>>
Matched link! Abandoned Mine Land Inventory Points. attempting download...
URL: <bound method Browser.geturl of <mechanize._mechanize.Browser instance at 0x02D9D7B0>>
http://www.pasda.psu.edu/data/dep/AMLInventoryPoints2013_04.zip
An unknown error occurred.
Matched link! Abandoned Mine Land Inventory Polygons. attempting download...
An unknown error occurred.
Matched link! Abandoned Mine Land Inventory Sites. attempting download...
An unknown error occurred.
Matched link! Coal Mining Operations. attempting download...
An unknown error occurred.
Matched link! Coal Pillar Location-Mining. attempting download...
An unknown error occurred.
Matched link! Industrial Mineral Mining Operations. attempting download...
An unknown error occurred.
Matched link! Longwall Mining Panels. attempting download...
An unknown error occurred.
Matched link! Mine Drainage Treatment/Land Recycling Project Locations. attempting download...
An unknown error occurred.
Matched link! Mined Out Areas. attempting download...
An unknown error occurred.
Matched link! Residual Waste Operations. attempting download...
An unknown error occurred.
Matched link! Underground Mining Permit. attempting download...
An unknown error occurred.
>>>
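The bare `except:` in the loop prints the same message for every failure, so the output above does not say what went wrong. One way to find out is to print the traceback instead. Below is a minimal sketch of that idea that runs without mechanize. `risky_step` and `run_step` are hypothetical names used only for illustration. The stand-in step assumes that `br.retrieve` returns a `(filename, headers)` tuple the way `urllib.urlretrieve` does. If that is the case, `print f + "\n"` would raise a `TypeError` right after the first download, before `br.back()` is reached.

```python
import traceback


def risky_step():
    # Stand-in for the download step. ASSUMPTION: retrieve() hands back a
    # (filename, headers) tuple, as urllib's urlretrieve does.
    result = ("AMLInventoryPoints2013_04.zip", None)
    return result + "\n"  # tuple + str raises TypeError


def run_step(step):
    """Run one step and return (ok, details) instead of hiding the error."""
    try:
        step()
        return True, ""
    except Exception:
        # format_exc() returns the full traceback text of the active exception
        return False, traceback.format_exc()


ok, details = run_step(risky_step)
print(ok)
print(details)
```

In the real loop, the equivalent change would be to replace the fixed message in the `except` branch with `traceback.print_exc()`, so each failed iteration reports its own exception type and line.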
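I think the problem may be that there is no wait time between downloads. This code successfully downloads the first file in the loop, no matter which one is chosen. Or maybe it's some other error I'm not aware of; I only downloaded mechanize and Beautiful Soup yesterday!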
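To test the wait-time idea on its own, a pause can be added between iterations with `time.sleep`. This is only a sketch: `download_all`, `fetch`, and `delay_seconds` are hypothetical names, not part of mechanize, and `fetch` stands in for whatever performs a single download. Whether a delay actually helps here is unverified.

```python
import time


def download_all(names, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(name) for each name, pausing between consecutive calls."""
    results = []
    for i, name in enumerate(names):
        if i > 0:
            # No pause before the first item, one pause before each later item
            sleep(delay_seconds)
        results.append(fetch(name))
    return results


# Usage with a fake fetch function that only builds a file name
files = download_all(
    ["mined out areas", "coal mining operations"],
    lambda name: name.replace(" ", "_") + ".zip",
    delay_seconds=0.1,
)
print(files)
```

Passing `sleep` as a parameter keeps the pause easy to swap out or shorten while experimenting.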
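Thanks! Sorry for the long delay... I'll do my best to get back to you as soon as possible! I thought this would never get answered. This is my first question here, and I didn't ask it very well.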