2011-08-10 106 views

Answers

3

This is the code snippet I currently use.

import mimetypes
import os
import urllib2
import urlparse

def filename_from_url(url):
    # Last component of the URL path, e.g. 'file.zip'
    return os.path.basename(urlparse.urlsplit(url)[2])

def download_file(url):
    """Open the URL with urllib2 and return the response plus some useful info"""
    name = filename_from_url(url)
    r = urllib2.urlopen(urllib2.Request(url))
    info = r.info()
    if 'filename=' in info.get('Content-Disposition', ''):
        # If the response has Content-Disposition, we take filename from it
        name = info['Content-Disposition'].split('filename=')[1]
        if name[:1] in ('"', "'"):
            name = name[1:-1]
    elif r.geturl() != url:
        # if we were redirected, take the filename from the final url
        name = filename_from_url(r.geturl())
    content_type = None
    if 'Content-Type' in info:
        content_type = info['Content-Type'].split(';')[0]
    # Try to guess missing info
    if not name and not content_type:
        name = 'unknown'
    elif not name:
        # Parentheses matter: guess_extension() may return None
        name = 'unknown' + (mimetypes.guess_extension(content_type) or '')
    elif not content_type:
        content_type = mimetypes.guess_type(name)[0]
    return r, name, content_type

Usage:

req, filename, content_type = download_file('http://some.url') 

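You can then use req as a file-like object and, for example, use shutil.copyfileobj() to copy the file contents to a local file. If the MIME type doesn't matter, simply remove that part of the code.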
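For readers on Python 3, here is a rough sketch of just the filename-picking part, without the MIME type handling. It is not the code from this answer: `pick_filename` is a hypothetical helper name, the example URL is made up, and it assumes the headers object is an `email.message.Message` (which is what `urlopen(...).info()` returns on Python 3).

```python
import posixpath
from email.message import Message
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path component of a URL ('' if the path ends in '/')."""
    return posixpath.basename(urlsplit(url).path)


def pick_filename(requested_url, final_url, headers):
    """Hypothetical helper: Content-Disposition first, then the final URL."""
    # get_filename() reads the filename parameter and strips the quotes
    name = headers.get_filename()
    if name:
        # Keep only the part after the last '/' of a server-supplied name
        return posixpath.basename(name)
    if final_url != requested_url:
        # We were redirected, so take the name from the final URL
        return filename_from_url(final_url) or 'unknown'
    return filename_from_url(requested_url) or 'unknown'


# Example with hand-built headers (no network access needed)
headers = Message()
headers['Content-Disposition'] = 'attachment; filename="report.pdf"'
print(pick_filename('http://example.com/get', 'http://example.com/get', headers))
```

With a real response you would pass `resp.info()` as `headers` and `resp.geturl()` as `final_url`.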

Since you seem to be lazy, here is code that downloads the file directly to a local file:

import shutil

def download_file_locally(url, dest):
    req, filename, content_type = download_file(url)
    if dest.endswith('/'):
        # dest is a directory: use the filename reported by the server
        dest = os.path.join(dest, filename)
    try:
        with open(dest, 'wb') as f:
            shutil.copyfileobj(req, f)
    finally:
        req.close()

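This method is smart enough to use the filename sent by the server if you specify a path ending with a slash; otherwise it uses the destination you specified.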
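The trailing-slash rule can be looked at in isolation. Below is a minimal sketch, where `resolve_dest` is a hypothetical name and the `os.path.basename` call is an extra precaution I added (it is not in the code above) so that a server-supplied name cannot point outside the target directory.

```python
import os


def resolve_dest(dest, server_filename):
    """Hypothetical helper: decide where a download should be written."""
    if dest.endswith('/'):
        # Directory given: append the server's filename, minus any path part
        return os.path.join(dest, os.path.basename(server_filename))
    # Anything else is treated as the full target path
    return dest


print(resolve_dest('downloads/', 'archive.zip'))
print(resolve_dest('my_copy.zip', 'archive.zip'))
```

So `download_file_locally(url, 'downloads/')` keeps the server's name, while `download_file_locally(url, 'my_copy.zip')` writes to exactly the path you typed.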

+0

Can you provide a simpler option? – Zygimantas

+0

Seriously? What could be simpler than copy and paste plus a one-line function call? – ThiefMaster

+0

Where do I enter the file name? – Zygimantas

1

Use ftplib.

Code example (from the documentation):

>>> from ftplib import FTP 
>>> ftp = FTP('ftp.cwi.nl') # connect to host, default port 
>>> ftp.login()    # user anonymous, passwd anonymous@
>>> ftp.retrlines('LIST')  # list directory contents 
total 24418 
drwxrwsr-x 5 ftp-usr pdmaint  1536 Mar 20 09:48 . 
dr-xr-srwt 105 ftp-usr pdmaint  1536 Mar 21 14:32 .. 
-rw-r--r-- 1 ftp-usr pdmaint  5305 Mar 20 09:48 INDEX 
. 
. 
. 
>>> ftp.retrbinary('RETR README', open('README', 'wb').write) 
'226 Transfer complete.' 
>>> ftp.quit() 
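
As a script rather than an interactive session, the same download might look like the sketch below (Python 3). `fetch_ftp_file` and `download_readme` are hypothetical names; the host and file name are taken from the session above and I have not checked that the server is still reachable. Unlike the one-liner in the session, the local file is closed explicitly by the `with` block.

```python
from ftplib import FTP


def fetch_ftp_file(ftp, remote_name, local_path):
    """Download remote_name over an already logged-in FTP connection."""
    with open(local_path, 'wb') as f:
        # retrbinary calls f.write once for every block it receives
        return ftp.retrbinary('RETR ' + remote_name, f.write)


def download_readme(host='ftp.cwi.nl'):
    """Assumed example: anonymous login, then fetch README into the cwd."""
    with FTP(host) as ftp:  # the connection is closed on exit
        ftp.login()
        return fetch_ftp_file(ftp, 'README', 'README')
```

Nothing runs on import; call `download_readme()` yourself when you have network access.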

6

from urllib2 import urlopen 
req = urlopen('ftp://ftp.gnu.org/README') 

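You can then use req.read() to load the file contents into a variable or do something else with them, or use shutil.copyfileobj to save the contents to disk without loading them into memory.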
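Both options side by side, as a Python 3 sketch (the snippet above is Python 2; `urllib2.urlopen` became `urllib.request.urlopen`). The function names `read_url` and `save_url` are my own, and the output name `README.txt` in the note below is just an example.

```python
import shutil
from urllib.request import urlopen


def read_url(url):
    """Load the whole resource into memory and return it as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def save_url(url, dest):
    """Stream the resource to dest in chunks, without holding it all in memory."""
    with urlopen(url) as resp, open(dest, 'wb') as f:
        shutil.copyfileobj(resp, f)
```

For the FTP file from this answer that would be `save_url('ftp://ftp.gnu.org/README', 'README.txt')`.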

+0

+1. Short, sweet, and it works. – noiv

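0

from urllib.request import urlopen
from urllib.error import URLError

try:
    req = urlopen('ftp://ftp.expasy.org/databases/enzyme/enzclass.txt')
except URLError as e:
    # Catch URL/FTP failures only, instead of a bare except
    print("Error:", e)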