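How to download a large file in Python 2
I am trying to download a large file (about 1 GB) with the mechanize module, but I have not been successful. I have searched for similar threads, but I only found ones where the file is publicly accessible and no login is required. That is not my case: the file is in a private section, and I need to log in before downloading. Here is what I have done so far.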
import mechanize
g_form_id = ""
def is_form_found(form1):
    return "id" in form1.attrs and form1.attrs['id'] == g_form_id
def select_form_with_id_using_br(br1, id1):
    global g_form_id
    g_form_id = id1
    try:
        br1.select_form(predicate=is_form_found)
    except mechanize.FormNotFoundError:
        print "form not found, id: " + g_form_id
        exit()
url_to_login = "https://example.com/"
url_to_file = "https://example.com/download/files/filename=fname.exe"
local_filename = "fname.exe"
br = mechanize.Browser()
br.set_handle_robots(False) # ignore robots
br.set_handle_refresh(False) # can sometimes hang without this
br.addheaders = [('User-agent', 'Firefox')]
response = br.open(url_to_login)
# Find login form
select_form_with_id_using_br(br, 'login-form')
# Fill in data
br.form['email'] = '[email protected]'
br.form['password'] = 'password'
br.set_all_readonly(False) # allow everything to be written to
br.submit()
# Try to download file
br.retrieve(url_to_file, local_filename)
But I get an error once about 512 MB has been downloaded:
Traceback (most recent call last):
  File "dl.py", line 34, in <module>
    br.retrieve(url_to_file, local_filename)
  File "C:\Python27\lib\site-packages\mechanize\_opener.py", line 277, in retrieve
    block = fp.read(bs)
  File "C:\Python27\lib\site-packages\mechanize\_response.py", line 199, in read
    self.__cache.write(data)
MemoryError: out of memory
Do you have any idea how to solve this? Thanks.
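The last frame of the traceback, self.__cache.write(data), suggests that the mechanize response object keeps its own copy of everything it reads, so memory use would grow with the size of the download. One possible workaround, sketched below, is to keep the mechanize login but fetch the file through a plain opener that shares the same cookies. The helper name copy_in_blocks and the wiring in the comments are illustrative assumptions, not part of the original script, and the idea only works if the site authenticates by cookie.

```python
def copy_in_blocks(source, dest_path, block_size=1024 * 1024):
    """Copy a file-like `source` to dest_path block by block; return bytes written."""
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            block = source.read(block_size)
            if not block:  # an empty read means the stream is finished
                break
            out.write(block)
            written += len(block)
    return written


# Hypothetical wiring for Python 2 (assumption: the login is cookie-based):
#   import cookielib, urllib2
#   jar = cookielib.CookieJar()
#   br.set_cookiejar(jar)          # call this before br.open(url_to_login)
#   ... log in with mechanize as in the question ...
#   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
#   opener.addheaders = [('User-agent', 'Firefox')]
#   copy_in_blocks(opener.open(url_to_file), local_filename)
```

Only one block of at most block_size bytes is held in memory at a time, so the size of the file should no longer matter.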
Maybe try 'requests'. – user3041764
Do you have to use mechanize? –
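No, as long as the file gets downloaded, I don't care how it is done. The login part is the problem, though. If another module can handle that, I'm open to it. –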
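Following the 'requests' suggestion, here is a minimal sketch of logging in with a session and streaming the file to disk in chunks. It assumes the login form can be submitted as a plain POST of the email and password fields to the login URL; a site with hidden fields or CSRF tokens would need those included as well. The function names stream_to_file and login_and_download are made up for illustration.

```python
import contextlib


def stream_to_file(session, file_url, dest_path, chunk_size=1024 * 1024):
    """Save file_url to dest_path one chunk at a time; return bytes written.

    `session` is expected to behave like requests.Session: get(url, stream=True)
    returns a response offering raise_for_status(), iter_content() and close().
    """
    response = session.get(file_url, stream=True)
    written = 0
    with contextlib.closing(response):
        response.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty chunks
                    out.write(chunk)
                    written += len(chunk)
    return written


def login_and_download(session, login_url, credentials, file_url, dest_path):
    """Post the login form, then stream the protected file with the same session."""
    login_response = session.post(login_url, data=credentials)
    login_response.raise_for_status()
    return stream_to_file(session, file_url, dest_path)


# Usage with the real library (assumes `pip install requests`):
#   import requests
#   session = requests.Session()
#   login_and_download(
#       session,
#       "https://example.com/",  # assumed login endpoint; check the form's action
#       {"email": "...", "password": "..."},
#       "https://example.com/download/files/filename=fname.exe",
#       "fname.exe",
#   )
```

With stream=True the body is not loaded up front, and iter_content hands over one chunk at a time, so memory use should stay near chunk_size regardless of the file size. The session carries the login cookies from the POST over to the download request.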