How to detect deadlocks in a Django application (and remove them)

I maintain a Django project that periodically becomes unresponsive. So far I have dealt with this by constantly monitoring the application and restarting Apache when necessary.

Unresponsive in what way? Apache stops answering any requests at all.

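Environment:

OS: Debian Squeeze, 64-bit
Web server: Apache 2.2.16 with mod_wsgi (mod_python was in production for about a year)
Django: 1.3.1 (and every major release since 1.0)
Python: 2.6.6 + virtualenv (using distribute, no site-packages; several different setups were in production before)
Database backend: psycopg2 2.3.2
Database: PostgreSQL 9.0 (version 8.3 was used in the past)
Connection pool: pgbouncer (the problem remains if no bouncer is used)
Reverse proxy: nginx 1.0.11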

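What can I do to get closer to the root of the error? (I can't provide the source code, although snippets may be possible.) I have been hunting this problem for so long that I can't list everything I have tried. I tried to get rid of any 'magic' I could think of. Several parts of the application have been rewritten since the problem first occurred.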

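I'm sorry for the lack of detail, but I will gladly provide (almost) any information requested, and I promise to do my best to make this post helpful to others facing a similar problem.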

2012-01-10 tback

What kind of monitoring are you doing? Munin, Monit, Nagios? – 2012-01-10 15:08:28

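The relevant monitoring is done by a shell script that checks the server status and a static page every 30 seconds. I also have munin for operational statistics (number of requests, etc.) and nagios to monitor some other required resources. – tback 2012-01-10 15:25:10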

You do have 'DEBUG = False' in settings.py, right? – danodonovan 2012-01-10 15:34:53

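Ultimately, you want the new features added in mod_wsgi 4.0. They give daemon mode better control over automatic restarts when requests block. When restarting because of a blocked condition, mod_wsgi will try to dump Python stack traces showing what each Python request thread is currently doing, so you can find out why the threads are blocked.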

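I suggest you take the issue up on the mod_wsgi mailing list, where the new features can be explained in more detail if need be. I have already posted about it before at:

http://groups.google.com/group/modwsgi/msg/2a968d820e18e97d

The 4.0 code is only available from the mod_wsgi source code repository at this time. The current trunk head is regarded as stable.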
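The kind of per-thread stack dump described in this answer can be approximated in plain Python. This is a minimal sketch and not mod_wsgi's own implementation: the helper name `format_thread_stacks` is hypothetical, and it uses only `sys._current_frames()`, `threading.enumerate()` and the `traceback` module from the standard library.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report with the current stack of every Python thread.

    Hypothetical helper for illustration; not part of mod_wsgi or Django.
    """
    # Map thread identifiers to names so the report is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_thread_stacks())
```

If a report like this is written to a log while the application hangs, threads that are all stuck on the same `acquire` or import line are a strong hint of a deadlock. How and when to trigger the dump inside a running server is left open here.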

2012-01-10 22:53:57

You may be bitten by the following Django bug [1] (it is not yet fixed in the 1.4 branch).

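Workaround: manually apply the fix to your Django source, or use a thread-safe wrapper around the WSGI handler as shown below (we use this on production systems).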

from __future__ import with_statement
from threading import Lock

from django.core.handlers.wsgi import WSGIHandler as DjangoWSGIHandler

__copyright__ = "Jibe"


class WSGIHandler(DjangoWSGIHandler):
    """
    This provides a threadsafe drop-in replacement of django's WSGIHandler.

    Initialisation of django via a multithreaded wsgi handler is not safe.
    It is vulnerable to an A-B B-A deadlock.

    When two threads bootstrap django via different urls you have a chance to
    hit the following deadlock:

        thread 1                          thread 2
        view A                            view B
        import file foo                   import file bar
          (takes import lock foo)           (takes import lock bar)
        bootstrap django
          (takes AppCache.write_lock)
        import file bar
          (import lock bar) <-- blocks
                                          bootstrap django
                                            (AppCache.write_lock) <-- deadlock

    Workaround for an AB BA deadlock: wrap it in a lock C.

        lock C          lock C
        lock A          lock B
        lock B          lock A
        release B       release A
        release A       release B
        release C       release C

    That's exactly what this class does, but... only for the first few calls.
    After that we remove the lock C, as the AppCache.write_lock is only held
    when django is booted.

    If we would not remove the lock C after the first few calls, that would
    make the whole app single threaded again.

    Usage:
        in your wsgi file replace the following lines
            import django.core.handlers.wsgi
            application = django.core.handlers.wsgi.WSGIHandler()
        by
            import threadsafe_wsgi
            application = threadsafe_wsgi.WSGIHandler()

    FAQ:
        Q: why would you want threading in the first place?
        A: to reduce memory. Big apps can consume hundreds of megabytes each.
           Adding processes is then much more expensive than threads.
           That memory is better spent caching, when threads are almost free.

        Q: this deadlock, it looks far-fetched, is this real?
        A: yes, we had this problem on production machines.
    """
    __initLock = Lock()  # lock C
    __initialized = 0    # number of requests served while holding lock C

    def __call__(self, environ, start_response):
        # The first calls (4) we squeeze everybody through lock C.
        # This basically serializes all threads.
        MIN_INIT_CALLS = 4
        if self.__initialized < MIN_INIT_CALLS:
            with self.__initLock:
                ret = DjangoWSGIHandler.__call__(self, environ, start_response)
                self.__initialized += 1
                return ret
        else:
            # We are safely bootstrapped, skip lock C.
            # Now we are running multi-threaded again.
            return DjangoWSGIHandler.__call__(self, environ, start_response)

and use the following code in your wsgi.py:

from threadsafe_wsgi.handlers import WSGIHandler 
django_handler = WSGIHandler()
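The lock ordering from the docstring can be tried in isolation, without Django. The following is only an illustrative sketch: `lock_a`, `lock_b`, `lock_c`, `worker` and `run_demo` are hypothetical names standing in for the two import locks, the outer lock C and the request threads. Two threads take A and B in opposite order, which could deadlock on its own; taking C first means only one thread at a time is inside the A/B section.

```python
import threading

lock_a = threading.Lock()  # stands in for the first inner lock
lock_b = threading.Lock()  # stands in for the second inner lock
lock_c = threading.Lock()  # outer lock C that serializes the critical section


def worker(name, first, second, finished, rounds=1000):
    for _ in range(rounds):
        with lock_c:       # taking C first makes the A/B order irrelevant
            with first:
                with second:
                    pass   # the work that needs both locks would go here
    finished.append(name)


def run_demo():
    finished = []
    t1 = threading.Thread(target=worker, args=("thread 1", lock_a, lock_b, finished))
    t2 = threading.Thread(target=worker, args=("thread 2", lock_b, lock_a, finished))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return sorted(finished)


if __name__ == "__main__":
    print(run_demo())  # ['thread 1', 'thread 2']
```

The cost of this design is that everything under lock C runs single-threaded, which is why the handler above only holds it for the first few requests.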

[1] https://code.djangoproject.com/ticket/18251

2013-03-04 19:55:50 harmv
