2013-10-15 33 views
8

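Boto S3 occasionally throws httplib.IncompleteRead

I have several daemons that read many files from Amazon S3 using boto. Once every few days I run into a situation where httplib.IncompleteRead is raised from deep inside boto. If I retry the request, it fails immediately with another IncompleteRead. Even if I call bucket.connection.close(), all further requests still fail.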

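I suspect I may have stumbled on a bug in boto, but nobody else seems to have hit it. Am I doing something wrong? All the daemons are single-threaded, and I have tried setting is_secure both ways.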

Traceback (most recent call last):
  ...
  File "<file_wrapper.py>", line 22, in next
    line = self.readline()
  File "<file_wrapper.py>", line 37, in readline
    data = self.fh.read(self.buffer_size)
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py>", line 378, in read
    self.close()
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py>", line 349, in close
    self.resp.read()
  File "<virtualenv/lib/python2.6/site-packages/boto/connection.py>", line 411, in read
    self._cached_response = httplib.HTTPResponse.read(self)
  File "/usr/lib/python2.6/httplib.py", line 529, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python2.6/httplib.py", line 621, in _safe_read
    raise IncompleteRead(''.join(s), amt)

Environment:

  • Amazon EC2
  • Ubuntu 11.10
  • Python 2.6.7
  • Boto 2.12.0

Answers

2

This may well be a bug in boto, but the symptoms you describe are not unique to it. See

IncompleteRead using httplib

Since httplib appears in your traceback, one solution is proposed here:

http://bobrochel.blogspot.in/2010/11/bad-servers-chunked-encoding-and.html?showComment=1358777800048

Disclaimer: I have no experience with boto. This is posted on the basis of research, since there were no other answers.

+0

Thanks for the effort, that is a good reference. I still haven't found a good solution, but you deserve the bounty more than anyone ;) – shx2

+0

Thanks. :) If I learn more, I will report back. – Glenn

+0

Update: https://groups.google.com/forum/?fromgroups#!topic/boto-users/YiPAOvxIrUY – Glenn

2

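I struggled with this problem for a while, running long-lived processes that read large amounts of data from S3. I decided to post my solution here, for posterity. First, I am sure the hack @Glenn pointed to works, but I chose not to use it because I consider it intrusive (it hacks httplib) and unsafe (it blindly returns whatever it got, i.e. return e.partial, even though this may be a genuine error case).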
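For reference, the hack in question amounts to roughly the sketch below. This is a paraphrase based on the `return e.partial` behaviour described above, not the linked post's exact code; the name `tolerate_incomplete_read` is made up for illustration, and the import fallback is only there so the snippet also runs on Python 3.

```python
try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # same module under its Python 3 name


def tolerate_incomplete_read(read_func):
    """Wrap a read function so IncompleteRead yields the partial data.

    Hypothetical helper name; it only illustrates the idea of the hack.
    """
    def wrapper(*args, **kwargs):
        try:
            return read_func(*args, **kwargs)
        except httplib.IncompleteRead as e:
            # The error is swallowed here: a possibly truncated body is
            # returned as though the read had succeeded.
            return e.partial
    return wrapper


# Applying the hack process-wide would look like the line below. It is left
# commented out because it changes behaviour for every httplib user in the
# process, which is the "intrusive" part.
# httplib.HTTPResponse.read = tolerate_incomplete_read(httplib.HTTPResponse.read)
```

The caller has no way to tell a complete body from a truncated one after this wrapper runs, which is why it looks unsafe for data that must be read in full.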

Below is the solution I finally came up with, which seems to work.

I use this generic retry function:

import time, logging, httplib, socket

def run_with_retries(func, num_retries, sleep = None, exception_types = Exception, on_retry = None):
    for i in range(num_retries):
        try:
            return func()  # call the function
        except exception_types as e:
            # failed on one of the known exceptions
            if i == num_retries - 1:
                raise  # this was the last attempt, so re-raise
            logging.warning('operation failed (%s) with error [%s]. will retry %d more times', func, e, num_retries - i - 1)
            if on_retry is not None:
                on_retry()
            if sleep is not None:
                time.sleep(sleep)
    assert 0  # should not reach this point

Now, when reading a file from S3, I use the following function, which retries internally on IncompleteRead errors. When an error occurs, it calls key.close(fast = True) before retrying.

def read_s3_file(key):
    """
    Reads the entire contents of a file on S3.
    @param key: a boto.s3.key.Key instance
    """
    return run_with_retries(
        key.read, num_retries = 3, sleep = 0.5,
        exception_types = (httplib.IncompleteRead, socket.error),
        # close the connection before retrying (fast=True so it doesn't attempt to read remaining)
        on_retry = lambda: key.close(fast = True)
    )
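
One way to check the retry behaviour without touching S3 is a small self-contained test like the sketch below. It repeats a condensed copy of the helper above (logging omitted) so that it runs on its own. `FlakyKey` is a hypothetical stand-in, not a boto class; it only mimics the `read()` and `close(fast=...)` calls used above, and the import fallback is there so the snippet also runs on Python 3.

```python
import time

try:
    import httplib  # Python 2, as in the code above
except ImportError:
    import http.client as httplib  # same module under its Python 3 name


def run_with_retries(func, num_retries, sleep=None,
                     exception_types=Exception, on_retry=None):
    # Condensed copy of the helper above, without the logging call.
    for i in range(num_retries):
        try:
            return func()
        except exception_types:
            if i == num_retries - 1:
                raise  # last attempt: give up and re-raise
            if on_retry is not None:
                on_retry()
            if sleep is not None:
                time.sleep(sleep)


class FlakyKey(object):
    """Hypothetical stand-in for a boto Key that fails a set number of times."""

    def __init__(self, data, failures):
        self.data = data
        self.failures = failures
        self.close_calls = 0

    def read(self):
        if self.failures > 0:
            self.failures -= 1
            raise httplib.IncompleteRead(self.data[:1])
        return self.data

    def close(self, fast=False):
        self.close_calls += 1


key = FlakyKey(b"hello", failures=2)
result = run_with_retries(
    key.read, num_retries=3,
    exception_types=(httplib.IncompleteRead,),
    on_retry=lambda: key.close(fast=True),
)
```

With two simulated failures and three attempts, the third read succeeds and `close` has been called once per failed attempt. If the key fails on all three attempts, the last IncompleteRead propagates to the caller rather than being swallowed.
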
+1

Just FYI, the issue has been raised at https://github.com/boto/boto/issues/2204 – Glenn
