2012-01-10 53 views
1

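I'm maintaining a django project that becomes unresponsive on a regular basis. Until now I have handled the situation by constantly monitoring the application and restarting apache when necessary. How can I detect deadlocks in a django application (and remove them)?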

How unresponsive? It means that apache no longer replies to any request.

Environment:

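  • OS: Debian Squeeze 64-bit
  • Web server: Apache 2.2.16 with mod_wsgi (mod_python was in production for about a year)
  • Django: 1.3.1 (and every major release since 1.0)
  • Python: 2.6.6 + virtualenv (using distribute, no-site-packages; several different setups were in production before)
  • Database backend: psycopg2 2.3.2
  • Database: PostgreSQL 9.0 (version 8.3 was used in the past)
  • Connection pooler: pgbouncer (the problem remains if the bouncer is not used)
  • Reverse proxy: nginx 1.0.11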

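What can I do to get closer to the source of the error? (I can't provide the source code, although snippets are possible.) I have been hunting this problem for so long that it is impossible to list everything I have tried. I tried to get rid of any 'magic' I could think of. Several parts of the application have been rewritten since the problem first occurred.

I'm sorry for the lack of detail, but I'll be glad to provide (almost) any requested information, and I promise to do my best to make this post helpful to others facing a similar problem.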

+0

What kind of monitoring are you doing? Munin, Monit, Nagios? – 2012-01-10 15:08:28

+0

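The relevant monitoring is done by a shell script that checks the server status and a static page every 30 seconds. I also have munin for operational statistics (number of requests, etc.) and nagios to monitor some other required resources. – tback 2012-01-10 15:25:10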

+0

You do have 'DEBUG = False' in settings.py, right? – danodonovan 2012-01-10 15:34:53

Answers

2

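Ultimately, you need the new features being added in mod_wsgi 4.0. These give daemon mode better control over automatic restarts when requests block. When it restarts because of a blocked condition, mod_wsgi will attempt to dump Python stack traces showing what each Python request thread is currently doing, so that you can work out why the threads are blocked.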
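To illustrate the kind of information such a dump contains, here is a minimal sketch that collects the current stack of every Python thread using only the standard library. It is not mod_wsgi's implementation; the function name `dump_thread_stacks` is made up for this example, and how you trigger it in a running process (a diagnostics URL, a background timer) is left open.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every Python thread.

    Hypothetical helper for illustration; not part of mod_wsgi or Django.
    """
    # Map thread ids to thread names so the report is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    report = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for thread_id, frame in sys._current_frames().items():
        report.append("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "unknown")))
        report.extend(traceback.format_stack(frame))
    return "".join(report)


if __name__ == "__main__":
    # Writing to stderr keeps the report out of any response body.
    sys.stderr.write(dump_thread_stacks())
```

In a hung process, threads that are all stuck on the same line (for example, waiting to acquire a lock) usually point to where the deadlock is.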

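I suggest you take the issue up on the mod_wsgi mailing list, where the new features can be explained in more detail if needed. I have already posted about it at:

http://groups.google.com/group/modwsgi/msg/2a968d820e18e97d

At this time the 4.0 code is only available from the mod_wsgi source code repository. The current trunk head is considered stable, though.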

1

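You may be getting bitten by the following django bug [1] (it is not yet fixed in the 1.4 branch).

Workaround: manually apply the fix to your Django source, or use a thread-safe wrapper around the WSGI handler as shown below (we use this on production systems).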

from __future__ import with_statement 
from django.core.handlers.wsgi import WSGIHandler as DjangoWSGIHandler 

from threading import Lock 

__copyright__ = "Jibe" 

class WSGIHandler(DjangoWSGIHandler): 
    """
    This provides a threadsafe drop-in replacement of django's WSGIHandler.

    Initialisation of django via a multithreaded wsgi handler is not safe.
    It is vulnerable to an A-B B-A deadlock.

    When two threads bootstrap django via different urls you have a chance
    to hit the following deadlock.

    thread 1                                    thread 2
    view A                                      view B
    import file foo     import lock foo         import file bar     import lock bar
      bootstrap django  lock AppCache.write_lock
        import file bar import lock bar <-- blocks
                                                  bootstrap django  lock AppCache.write_lock <-- deadlock

    Workaround for an AB BA deadlock: wrap it in a lock C.

        lock C          lock C
          lock A          lock B
          lock B          lock A
          release B       release A
          release A       release B
        release C       release C

    That's exactly what this class does, but only for the first few calls.
    After that we remove lock C, as the AppCache.write_lock is only held
    while django is booting.

    If we did not remove lock C after the first few calls, the whole app
    would be single threaded again.

    Usage:
      in your wsgi file replace the following lines
        import django.core.handlers.wsgi
        application = django.core.handlers.wsgi.WSGIHandler()
      by
        import threadsafe_wsgi
        application = threadsafe_wsgi.WSGIHandler()

    FAQ:
      Q: Why would you want threading in the first place?
      A: To reduce memory. Big apps can consume hundreds of megabytes each,
         so adding processes is much more expensive than adding threads.
         That memory is better spent on caching, when threads are almost free.

      Q: This deadlock looks far-fetched. Is it real?
      A: Yes, we had this problem on production machines.
    """
    __initLock = Lock()  # lock C
    __initialized = 0

    def __call__(self, environ, start_response):
        # Squeeze the first few (4) calls through lock C;
        # this effectively serializes all threads during bootstrap.
        MIN_INIT_CALLS = 4
        if self.__initialized < MIN_INIT_CALLS:
            with self.__initLock:
                ret = DjangoWSGIHandler.__call__(self, environ, start_response)
                self.__initialized += 1
                return ret
        else:
            # We are safely bootstrapped: skip lock C
            # and run multi-threaded again.
            return DjangoWSGIHandler.__call__(self, environ, start_response)

Then use the following code in your wsgi.py:

from threadsafe_wsgi.handlers import WSGIHandler 
django_handler = WSGIHandler() 
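The "lock C" idea from the docstring can be tried out in isolation. The sketch below is a toy model, not Django code: `lock_a` and `lock_b` stand in for the import lock and AppCache.write_lock, and all names are made up for the example. Two threads take the locks in opposite order, which could deadlock on its own, but because both must first take the guard lock `lock_c`, only one of them is ever inside the critical section.

```python
import threading


def run_demo():
    """Run two threads that lock A/B in opposite order under guard lock C."""
    lock_a = threading.Lock()  # stands in for the import lock
    lock_b = threading.Lock()  # stands in for AppCache.write_lock
    lock_c = threading.Lock()  # guard lock ("lock C")
    finished = []

    def worker(name, first, second):
        # Without lock_c, each thread could hold its first lock while
        # waiting forever for the lock held by the other thread.
        with lock_c:
            with first:
                with second:
                    finished.append(name)

    threads = [
        threading.Thread(target=worker, args=("thread 1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("thread 2", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(5.0)  # bounded wait so a bug cannot hang the demo
    return sorted(finished)


if __name__ == "__main__":
    print(run_demo())
```

The price of the guard lock is that everything under it runs one thread at a time, which is why the handler above only uses it for the first few requests.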

[1] https://code.djangoproject.com/ticket/18251