2011-11-30 97 views
5

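Downloading large files via Python

I'm trying to download a daily backup file from my server to my local storage server, but I'm running into problems.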

I wrote this code (with the irrelevant parts, such as the email function, removed):

import os 
from time import strftime 
from ftplib import FTP 
import smtplib 
from email.MIMEMultipart import MIMEMultipart 
from email.MIMEBase import MIMEBase 
from email.MIMEText import MIMEText 
from email import Encoders 

day = strftime("%d") 
today = strftime("%d-%m-%Y") 

link = FTP(ftphost) 
link.login(passwd = ftp_pass, user = ftp_user) 
link.cwd(file_path) 
link.retrbinary('RETR ' + file_name, open('/var/backups/backup-%s.tgz' % today, 'wb').write) 
link.delete(file_name) #delete the file from online server 
link.close() 
mail(user_mail, "Download database %s" % today, "Database sucessfully downloaded: %s" % file_name) 
exit() 

I run it from a crontab entry like this:

40 23 * * * python /usr/bin/backup-transfer.py >> /var/log/backup-transfer.log 2>&1 

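It works for small files, but with the backup file (about 1.7 GB) it freezes: the downloaded file reaches about 1.2 GB and then never grows (I waited a day), and the log file is empty.

Any ideas?

P.S.: I'm using Python 2.6.5.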

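To troubleshoot further, maybe you could use the 'callback' parameter of 'FTP.retrbinary' to gather more information about the download progress. Also, using 'maxblocksize' might reveal some network problems. – jcollado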
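A minimal sketch of the callback idea from the comment above. The helper name `make_progress_writer` and its `report` and `every` parameters are made up for illustration; only `FTP.retrbinary` and `FTP.size` come from ftplib. It wraps the file write so that progress is reported at intervals, which should at least show in the log where the transfer stalls.

```python
def make_progress_writer(out_file, total_size, report, every=10 * 1024 * 1024):
    """Return a callback for FTP.retrbinary that writes each block to
    out_file and calls report(bytes_done, total_size) about every
    `every` bytes, plus once when total_size is reached.
    (Hypothetical helper, not part of ftplib.)"""
    # A dict is used instead of `nonlocal` so this also runs on Python 2.
    state = {"done": 0, "next": every}

    def callback(block):
        out_file.write(block)
        state["done"] += len(block)
        if state["done"] >= state["next"] or state["done"] == total_size:
            report(state["done"], total_size)
            state["next"] = state["done"] + every

    return callback
```

With the question's variables it could be used roughly as `link.retrbinary('RETR ' + file_name, make_progress_writer(f, link.size(file_name), report))`, where `f` is the open local file and `report` is any function that logs its two arguments.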

Answers

6

Sorry for answering my own question, but I found a solution.

I tried ftputil without success, then tried many other approaches, and finally this works:

from ftplib import FTP
from time import strftime

def debug(txt):
    print(txt)

def ftp_connect(path):
    link = FTP(host = 'example.com', timeout = 5) #Keep low timeout
    link.login(passwd = 'ftppass', user = 'ftpuser')
    debug("%s - Connected to FTP" % strftime("%d-%m-%Y %H.%M"))
    link.cwd(path)
    return link

# path and filename are assumed to be set earlier in the script
downloaded = open('/local/path/to/file.tgz', 'wb')

link = ftp_connect(path)
file_size = link.size(filename)

max_attempts = 5 #I don't want death loops.

while file_size != downloaded.tell():
    try:
        debug("%s while > try, run retrbinary\n" % strftime("%d-%m-%Y %H.%M"))
        if downloaded.tell() != 0:
            # Resume from the current offset. It must be passed as rest=,
            # because the third positional argument is the block size.
            link.retrbinary('RETR ' + filename, downloaded.write, rest=downloaded.tell())
        else:
            link.retrbinary('RETR ' + filename, downloaded.write)
    except Exception as myerror:
        if max_attempts != 0:
            debug("%s while > except, something going wrong: %s\n \tfile lenght is: %i > %i\n" %
                (strftime("%d-%m-%Y %H.%M"), myerror, file_size, downloaded.tell())
            )
            link = ftp_connect(path)
            max_attempts -= 1
        else:
            break
debug("Done with file, attempt to download m5dsum")
[...]

In my log file I found:

01-12-2011 23.30 - Connected to FTP 
01-12-2011 23.30 while > try, run retrbinary 
02-12-2011 00.31 while > except, something going wrong: timed out 
    file lenght is: 1754695793 > 1754695793 
02-12-2011 00.31 - Connected to FTP 
Done with file, attempt to download m5dsum 

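Sadly, I have to reconnect to the FTP server even when the file has been fully downloaded. In my case that's not a problem, because I also have to download the md5sum. As you can see, I wasn't able to detect the timeout and retry on the same connection, so when a timeout occurs I simply reconnect. If anyone knows how to reconnect without creating a new ftplib.FTP instance, please let me know ;)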
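For what it's worth, the retry loop can be wrapped in a function. This is only a sketch under some assumptions: `download_with_resume` and its `connect` argument (a zero-argument callable returning a logged-in FTP object) are names invented here, and the server is assumed to support `SIZE` and `REST`. It passes the resume offset via the `rest` keyword of `retrbinary` and skips the extra reconnect when the error arrives after the last byte has already been written.

```python
import ftplib


def download_with_resume(connect, filename, out_file, max_attempts=5):
    """Download `filename` into `out_file`, reconnecting and resuming
    after errors. Returns True if the whole file was received.
    (Hypothetical helper; `connect` must return a logged-in FTP object.)"""
    link = connect()
    file_size = link.size(filename)
    attempts_left = max_attempts
    while out_file.tell() != file_size:
        try:
            # rest=None starts at the beginning; rest=N sends "REST N" first.
            link.retrbinary('RETR ' + filename, out_file.write,
                            rest=out_file.tell() or None)
        except ftplib.all_errors:
            # Stop if the file is already complete or retries are used up.
            if out_file.tell() == file_size or attempts_left == 0:
                break
            attempts_left -= 1
            link = connect()
    return out_file.tell() == file_size
```

With the code above it might be called as `download_with_resume(lambda: ftp_connect(path), filename, downloaded)`. Errors raised by `connect` itself are not caught here, so a failed reconnect still ends the script.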

2

You could try setting a timeout. From the docs:

# timeout in seconds 
link = FTP(host=ftp_host, user=ftp_user, passwd=ftp_pass, acct='', timeout=3600)
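To show what the timeout changes: without one, a read on a stalled connection can block indefinitely, which would be consistent with a frozen download and an empty log. With one, the blocked call raises `socket.timeout` and the script can react. A small sketch, where `greeting_arrives` is a made-up helper name and the host and port are whatever you pass in:

```python
import socket
from ftplib import FTP


def greeting_arrives(host, port, timeout):
    """Return True if the server sends its FTP greeting within `timeout`
    seconds, False if the connection attempt or the read times out.
    (Hypothetical helper for illustration.)"""
    ftp = FTP(timeout=timeout)
    try:
        ftp.connect(host, port)
        return True
    except socket.timeout:
        return False
    finally:
        # Safe to call even if the connection was never fully set up.
        ftp.close()
```

The same exception is what the retry loop in the other answer ends up catching, so a low timeout combined with reconnect-and-resume is a reasonable pairing.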