2010-08-20, 127 views
6

Python: how to download a zip file

I'm trying to download a zip file with this code:

import urllib
import urllib2

o = urllib2.build_opener(urllib2.HTTPCookieProcessor())

#login
p = urllib.urlencode({usernameField: usernameVal, passField: passVal})
f = o.open(authUrl, p)
data = f.read()
print data
f.close()

#download file
f = o.open(remoteFileUrl)
localFile = open(localFile, "wb")
localFile.write(f.read())
f.close()

I get some binary data back, but the file I "download" is much too small and is not a valid zip file. Am I not retrieving the zip file correctly? The HTTP response headers for f = o.open(remoteFileUrl) are shown below. I don't know whether a response like this needs special handling:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Pragma: private
Cache-Control: must-revalidate
Expires: Tue, 31 Dec 1997 23:59:59 GMT
Content-Disposition: inline; filename="files.zip";
Content-Type: application/zip
Transfer-Encoding: chunked
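These headers do look right for a zip download: Content-Type: application/zip, and a suggested filename in Content-Disposition. As a side note, that filename parameter can be pulled out with the stdlib email.message module in Python 3; this is just a sketch for illustration, not part of the original question:

```python
from email.message import EmailMessage

# Rebuild just the header of interest from the response above.
msg = EmailMessage()
msg["Content-Disposition"] = 'inline; filename="files.zip"'

# get_filename() extracts the filename parameter of Content-Disposition.
print(msg.get_filename())
```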

Answers

10

f.read() doesn't necessarily read the whole file; it may return just one packet (which might be the whole file if it's small, but won't be for a large file).

You need to loop over the packets like this:

while 1:
    packet = f.read()
    if not packet:
        break
    localFile.write(packet)
f.close()

f.read() returns an empty packet to indicate that you've read the whole file.
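On Python 3 this same read-until-empty loop is built into shutil.copyfileobj. A minimal sketch, using an io.BytesIO as a stand-in for the network response (the fake payload is my own; in the thread's code the source would be the object returned by o.open(remoteFileUrl)):

```python
import io
import shutil

# Stand-in for the network response: any file-like object with .read().
payload = b"PK\x03\x04" + b"x" * 100000  # fake zip-like bytes
source = io.BytesIO(payload)

dest = io.BytesIO()
# copyfileobj runs the loop above internally: read a chunk,
# stop on b"" (EOF), otherwise write it out.
shutil.copyfileobj(source, dest)
```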

+2

I'd be curious where in the docs you found this – 2010-08-20 17:50:03

+0

http://docs.python.org/library/urllib.html#urllib.urlopen: "a file-like object is returned", and then http://docs.python.org/library/stdtypes.html#file.read – RichieHindle 2010-08-23 08:12:14

+0

Really just one packet? I checked the docs at the link shown, and I don't see anywhere that it says read() reads anything less than everything up to EOF. Can you explain more? – 2011-06-07 21:48:01

1

If you don't mind reading the whole zip file into memory, the fastest way to read and write it is as follows:

data = f.readlines()
with open(localFile, 'wb') as output:
    output.writelines(data)

Otherwise, to read and write chunks as they come over the network, do:

with open(localFile, "wb") as output:
    chunk = f.read()
    while chunk:
        output.write(chunk)
        chunk = f.read()

This is a little less neat, but avoids keeping the whole file in memory at once. Hope that helps.
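The chunked loop above can also be written with iter() and a sentinel, which avoids repeating the f.read() call. A sketch with io.BytesIO simulating the response (the 16 KiB chunk size is an arbitrary choice of mine):

```python
import io
from functools import partial

CHUNK = 16 * 1024
source = io.BytesIO(b"data" * 10000)  # simulated network response
sink = io.BytesIO()

# iter(callable, sentinel) keeps calling source.read(CHUNK) until it
# returns b"" (EOF), yielding each non-empty chunk in turn.
for chunk in iter(partial(source.read, CHUNK), b""):
    sink.write(chunk)
```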

0

Try this:

#download file
f = o.open(remoteFileUrl)

response = ""
while 1:
    data = f.read()
    if not data:
        break
    response += data

with open(localFile, "wb") as local_file:
    local_file.write(response)
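One caveat on the approach above: growing a string with += inside the loop can recopy the whole buffer on each iteration; the usual idiom is to collect the chunks in a list and join once. A sketch with a simulated response (the 4096-byte chunk size is my own choice):

```python
import io

source = io.BytesIO(b"\x00\x01\x02" * 5000)  # simulated 15000-byte response

parts = []
while True:
    data = source.read(4096)
    if not data:
        break
    parts.append(data)

# A single join at the end avoids the repeated copying that += can cause.
response = b"".join(parts)
```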
1

Here is a more robust solution that uses urllib2 to download the file in chunks and print the download status:

import os
import urllib2
import math

def downloadChunks(url):
    """Helper to download large files
        the only arg is a url
        this file will go to a temp directory
        the file will also be downloaded
        in chunks and print out how much remains
    """

    baseFile = os.path.basename(url)

    #move the file to a more uniq path
    os.umask(0002)
    temp_path = "/tmp/"
    try:
        file = os.path.join(temp_path, baseFile)

        req = urllib2.urlopen(url)
        total_size = int(req.info().getheader('Content-Length').strip())
        downloaded = 0
        CHUNK = 256 * 10240
        with open(file, 'wb') as fp:
            while True:
                chunk = req.read(CHUNK)
                downloaded += len(chunk)
                # float() avoids Python 2 integer division, which would
                # print 0.0 until the download finished
                print math.floor((downloaded / float(total_size)) * 100)
                if not chunk:
                    break
                fp.write(chunk)
    except urllib2.HTTPError, e:
        print "HTTP Error:", e.code, url
        return False
    except urllib2.URLError, e:
        print "URL Error:", e.reason, url
        return False

    return file
+0

IMO it will only work if it handles the case where no "Content-Length" header is sent – 2011-12-22 08:59:01

+0

Good point, Xavier – Gourneau 2011-12-23 22:15:52
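Following up on the Content-Length comment: a sketch (Python 3, my own function name and chunk size) of a copy loop that reports progress only when the size is known, with io.BytesIO standing in for the real response so no network is needed:

```python
import io

def copy_with_progress(resp, out, total_size=None, chunk_size=16384):
    """Copy resp to out in chunks, printing percent done when size is known.

    In real use total_size would come from int(resp.headers["Content-Length"])
    when that header exists; with Transfer-Encoding: chunked it is None.
    """
    downloaded = 0
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:
            break
        out.write(chunk)
        downloaded += len(chunk)
        if total_size:  # guard: skip the percentage when the header was absent
            print("%3d%%" % (downloaded * 100 // total_size))
    return downloaded

# Simulated 40000-byte download.
payload = b"z" * 40000
written = copy_with_progress(io.BytesIO(payload), io.BytesIO(),
                             total_size=len(payload))
```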