2014-12-07

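Amazon S3 - boto fails to download

After moving from Ubuntu 12.04 to Ubuntu 14.04, I started having a lot of problems downloading files from S3. In about 1 in 20 cases, boto fails to download a file and blocks for 1-2 minutes before throwing an exception.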

It does not happen with very small files, only with medium and large ones.

I wrote a simple Python script to test this:

# Python 2 / boto 2 test script. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
# bucket_name and path must be defined before running.
import datetime
from boto.s3.connection import S3Connection

success = 0
for i in xrange(1000000):
    try:
        start = datetime.datetime.now()
        # Open a fresh connection and fetch the whole object on every iteration.
        s3conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
        bucket = s3conn.get_bucket(bucket_name)
        key = bucket.get_key(path)
        content = key.get_contents_as_string()
        delta = datetime.datetime.now() - start
        print 'Downloading completed in', delta.total_seconds(), 's, file size is', len(content), 'bytes'
        success += 1
        print 'Downloaded', i + 1, 'files, success rate: ', float(success)/(i + 1)
    except Exception as exc:
        print 'Error occurred:', exc

Here is some output from this script on my Ubuntu 14.04 machine:

Downloading completed in 1.76665 s, file size is 996320 bytes 
Downloaded 1 files, success rate: 1.0 
Downloading completed in 7.709181 s, file size is 996320 bytes 
Downloaded 2 files, success rate: 1.0 
Downloading completed in 1.762192 s, file size is 996320 bytes 
Downloaded 3 files, success rate: 1.0 
Downloading completed in 7.670499 s, file size is 996320 bytes 
Downloaded 4 files, success rate: 1.0 
Downloading completed in 1.806259 s, file size is 996320 bytes 
Downloaded 5 files, success rate: 1.0 
Downloading completed in 1.992967 s, file size is 996320 bytes 
Downloaded 6 files, success rate: 1.0 
... 
... 
... 
Downloading completed in 6.496797 s, file size is 996320 bytes 
Downloaded 21 files, success rate: 1.0 
Error occurred: [Errno 104] Connection reset by peer 
Downloading completed in 2.31506 s, file size is 996320 bytes 
Downloaded 23 files, success rate: 0.95652173913 
Error occurred: The read operation timed out 
Error occurred: The read operation timed out 
Downloading completed in 1.963559 s, file size is 996320 bytes 
Downloaded 26 files, success rate: 0.884615384615 
Downloading completed in 1.395313 s, file size is 996320 bytes 
Downloaded 27 files, success rate: 0.888888888889 
Downloading completed in 1.416122 s, file size is 996320 bytes 
Downloaded 28 files, success rate: 0.892857142857 
Downloading completed in 1.168238 s, file size is 996320 bytes 
Downloaded 29 files, success rate: 0.896551724138 
Downloading completed in 1.30582 s, file size is 996320 bytes 
Downloaded 30 files, success rate: 0.9 

I tried running this script on Windows and Mac machines on the same local network, and the results were 100% fine! I also have no problems on my 12.04 Amazon EC2 instance:

... 
Downloading completed in 2.015681 s, file size is 996320 bytes 
Downloaded 100 files, success rate: 1.0 

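Has anyone run into a similar problem? Where should I look? I tried debugging the boto library, but without success. Importantly, I have no download problems when I use other download methods on this machine; only boto fails. I tried different boto versions: 2.15.0 and 2.34.0.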

Answers

0

It turned out this has nothing to do with boto, since I was later able to reproduce it with curl.

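I fixed the problem for myself by moving the data from the European S3 region to the "US Standard" region, but I am still curious how it can behave this way. On one machine in the local network all files download fine, while on another machine 10-20% of downloads fail.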

I will raise this issue with Amazon if it bothers me further.
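
Since the failures are not specific to boto, one library-independent workaround is to retry a failed download a few times with a growing delay. The following is only a minimal sketch, not something from the original thread: `download_with_retries`, `fetch`, `attempts`, `base_delay` and `sleep` are hypothetical names, and it assumes the connection resets and read timeouts surface as `IOError`/`OSError` subclasses.

```python
import time


def download_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on network errors with exponential backoff.

    fetch is a zero-argument callable performing one download attempt
    (hypothetical; e.g. lambda: key.get_contents_as_string()).
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except (IOError, OSError):
            # Assumption: resets and read timeouts are subclasses of these.
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
```

This only hides the symptom; it does not explain why one machine on the network sees failures and another does not.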

0

When creating the connection you should specify the region; otherwise it may time out, because it may try other regions.

import boto.s3

conn = boto.s3.connect_to_region(aws_region, **creds)

where aws_region is a string and creds is a dictionary of your credentials.
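
To make the shape of `creds` concrete, here is a small sketch under stated assumptions. The helper names `make_creds` and `connect_to_s3` are hypothetical; the keyword names `aws_access_key_id` and `aws_secret_access_key` are the ones boto 2 connections accept, and the `None` check reflects how boto 2 behaves for an unknown region name as far as I know.

```python
def make_creds(access_key, secret_key):
    """Build the keyword arguments passed on to the boto 2 connection."""
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def connect_to_s3(region, creds):
    """Return an S3 connection bound to one region, or raise if it is unknown."""
    import boto.s3  # imported here so make_creds works without boto installed

    conn = boto.s3.connect_to_region(region, **creds)
    if conn is None:
        # Assumption: boto 2 returns None for an unrecognised region name.
        raise ValueError("unknown S3 region: %r" % region)
    return conn
```

It would be called as, for example, `connect_to_s3("eu-west-1", make_creds(access_key, secret_key))`, with the region matching the one the bucket lives in.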