2014-10-29 55 views
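Gunicorn worker crashes periodically: 'socket is not registered'

Every so often (about once every few hours) a gunicorn worker fails with the following error: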

[2014-10-29 10:21:54 +0000] [4902] [INFO] Booting worker with pid: 4902 
[2014-10-29 13:15:24 +0000] [4902] [ERROR] Exception in worker process: 
Traceback (most recent call last): 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 507, in spawn_worker 
    worker.init_process() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 109, in init_process 
    super(ThreadWorker, self).init_process() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 120, in init_process 
    self.run() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 177, in run 
    self.murder_keepalived() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 149, in murder_keepalived 
    self.poller.unregister(conn.sock) 
    File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 408, in unregister 
    key = super(EpollSelector, self).unregister(fileobj) 
    File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 243, in unregister 
    raise KeyError("{0!r} is not registered".format(fileobj)) 
KeyError: '<socket._socketobject object at 0x7f823f454d70> is not registered' 
... 
... 
[2014-10-29 13:15:24 +0000] [4902] [INFO] Worker exiting (pid: 4902) 
[2014-10-29 13:15:24 +0000] [5809] [INFO] Booting worker with pid: 5809 
... 

Configuration:

bind = '0.0.0.0:80' 
workers = 1 
threads = 4 
debug = True 
reload = True 
daemon = True 

I am using:

Python 2.7.6 
gunicorn==19.1.1 
trollius==1.0.2 
futures==2.2.0 

Any idea what the cause might be, and how to fix it?

Thanks!

Any luck? I am facing exactly the same situation! – Richeek 2015-05-30 15:30:08

Nope, still waiting for help from the community. – 2015-05-31 09:31:32

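I am not sure, since I still have to investigate further, but I think it may be related to the socket being closed before it can be unregistered. I am going to increase the graceful timeout and see what happens. Will update here :) – Richeek 2015-06-01 15:33:02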
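
The kind of failure suggested in the comment above can be reproduced outside gunicorn. What follows is a minimal sketch, not gunicorn code: it assumes Python 3's standard `selectors` module, which as far as I know is the module that `trollius.selectors` ports to Python 2, and `unregister_twice` is a hypothetical helper made up for illustration. It registers a socket, unregisters it, and then unregisters it a second time to show where the KeyError comes from.

```python
import selectors
import socket


def unregister_twice():
    """Register a socket, unregister it twice, and return the message
    of the KeyError raised by the second call (or None if none is raised)."""
    sel = selectors.DefaultSelector()
    left, right = socket.socketpair()
    try:
        sel.register(left, selectors.EVENT_READ)
        sel.unregister(left)      # first call succeeds
        try:
            sel.unregister(left)  # second call: the selector no longer knows this socket
        except KeyError as exc:
            return str(exc)
        return None
    finally:
        left.close()
        right.close()
        sel.close()


message = unregister_twice()
print(message)
```

If the sketch matches what happens inside the worker, the message printed here has the same "is not registered" wording as the traceback in the question.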

Answers

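I faced a similar problem: I was getting timeout errors from gunicorn workers. I am using sync workers with the default settings for timeout and keepalive. In my use case an HTTP request takes a long time to finish, so the sync worker was timing out. My HTTP client is curl, which sends HTTP/1.1 requests. I increased the timeout to a crazy high value of 3600 seconds, i.e. 1 hour, and that worked. However, in the server error log I then saw the same error as you.

Here is my hypothesis about this error. Since all HTTP/1.1 requests are persistent by default, the server tries to reuse the connection by putting it back in the queue, but not beyond the keepalive timeout. When the keepalive timeout expires, the server unregisters the socket so that it cannot be reused, and closes it. Because my timeout value was very high while keepalive was still at its default (2 seconds according to the gunicorn settings documentation), the server tried several times to unregister a socket that had already been unregistered, hence the error.

So I increased the keepalive value to 3600 as well. So far it works.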

# http://gunicorn-docs.readthedocs.org/en/latest/settings.html 
timeout = 3600 # one hour timeout for long running jobs 
keepalive = 3600
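
If the hypothesis above is right, the underlying problem is cleanup code that assumes a socket is still registered. The following is only a sketch of a more tolerant cleanup loop, written against Python 3's standard `selectors` module. It is not gunicorn's implementation, and `reap_expired` and `keepalive_queue` are hypothetical names introduced for illustration. It closes connections whose keep-alive deadline has passed and ignores sockets that the selector no longer knows about.

```python
import selectors
import socket
import time
from collections import deque


def reap_expired(selector, keepalive_queue, now):
    """Close connections whose keep-alive deadline has passed.

    keepalive_queue holds (deadline, sock) pairs, oldest first.
    Returns the number of queue entries that were processed.
    """
    reaped = 0
    while keepalive_queue and keepalive_queue[0][0] <= now:
        _, sock = keepalive_queue.popleft()
        try:
            selector.unregister(sock)
        except (KeyError, ValueError):
            # KeyError: the socket was already unregistered.
            # ValueError: the socket was already closed (no valid descriptor).
            pass
        sock.close()  # closing an already closed socket is harmless
        reaped += 1
    return reaped


sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)

# The same socket is queued twice on purpose to mimic a double cleanup.
queue = deque([(0.0, a), (0.0, a)])
reaped = reap_expired(sel, queue, now=time.monotonic())

b.close()
sel.close()
```

In this sketch the second pass over the same socket is silently skipped instead of raising and killing the worker. Whether swallowing the error is acceptable depends on why the socket was dropped in the first place, so treat it as a way to test the hypothesis and not as a confirmed fix.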