2013-08-20 22 views

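We are facing a problem with a WCF service that we cannot reproduce. From time to time the service stops responding to client calls. This often happens on Mondays, after a period of inactivity. The WCF application hangs (WinDBG output attached).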

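The WCF service is self-hosted in a Windows service. The instance context mode is PerCall. It uses NetTcpBinding without security, and the whole WCF configuration is done in code, with no XML configuration. We have set the ServiceThrottle parameters to 1024 for sessions, calls and instances. Here is the complete ServiceHost configuration: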

 


    ServiceThrottlingBehavior throttle; 
    throttle = _svcHost.Description.Behaviors.Find<ServiceThrottlingBehavior>(); 
    if (throttle == null) 
    { 
        throttle = new ServiceThrottlingBehavior(); 
        throttle.MaxConcurrentCalls = 1024; 
        throttle.MaxConcurrentSessions = 1024; 
        throttle.MaxConcurrentInstances = 1024; 
        _svcHost.Description.Behaviors.Add(throttle); 
    } 

    ... 

    TimeSpan timeout = new TimeSpan(0, 0, 5); 

    NetTcpBinding binding = new NetTcpBinding(SecurityMode.None); 
    binding.OpenTimeout = timeout; 
    binding.CloseTimeout = timeout; 
    binding.ReceiveTimeout = timeout; 
    binding.SendTimeout = timeout; 
    binding.MaxBufferSize = 10485760; 
    binding.MaxReceivedMessageSize = 10485760; 

    XmlDictionaryReaderQuotas quotas = new XmlDictionaryReaderQuotas(); 
    binding.ReaderQuotas = quotas; 
    binding.ReaderQuotas.MaxStringContentLength = 10485760; 
    binding.ReaderQuotas.MaxArrayLength = 10000; 

    binding.Security.Message.ClientCredentialType = MessageCredentialType.None; 

    ... 

    ServiceEndpoint endpoint = _svcHost.AddServiceEndpoint(intfType, binding, serviceBaseAddress + "/" + intfType.Name); 
    endpoint.Behaviors.Add(new ClientTrackerEndpointBehavior()); 
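
One detail of the snippet above: the three limits are only assigned when no ServiceThrottlingBehavior is registered yet. A minimal sketch that applies them in either case is shown below. The helper name EnsureThrottle and its parameters are assumptions made for illustration, not part of the original service; it would have to be called before the host is opened.

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

public static class ThrottleSetup
{
    // Hypothetical helper: reuses an existing throttling behavior if one is
    // registered, otherwise adds a new one, and sets the limits in both cases.
    public static ServiceThrottlingBehavior EnsureThrottle(ServiceHostBase host, int limit)
    {
        ServiceThrottlingBehavior throttle =
            host.Description.Behaviors.Find<ServiceThrottlingBehavior>();

        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }

        // Assigned outside the null check so a pre-existing behavior is updated too.
        throttle.MaxConcurrentCalls = limit;
        throttle.MaxConcurrentSessions = limit;
        throttle.MaxConcurrentInstances = limit;
        return throttle;
    }
}
```

With the names used in the question, the call would look like ThrottleSetup.EnsureThrottle(_svcHost, 1024).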

 

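The problem shows up as an exception thrown on the client side. Between 5 and 10 clients are connected to the service, and every one of them throws this type of exception (even a client running on the same machine as the service itself):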

 


    System.ServiceModel.CommunicationException 
        The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '00:00:29.9969997'. 
        An existing connection was forcibly closed by the remote host 
    StackTrace: 
        at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags) 
        at System.ServiceModel.Channels.SocketConnection.Write(Byte[] buffer, Int32 offset, Int32 size, Boolean immediate, TimeSpan timeout) 

 

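After the exception was thrown, we tried to connect to the service manually using telnet, and that connection attempt appeared to be refused as well. Because 3 of our customers hit the problem on their production systems on Monday, we could not attach the Visual Studio debugger, so we used WinDBG to create user mini dumps and analyze the problem. The first thing we checked was the current values of the ServiceThrottle (only one dump is shown here, but the output is equivalent to that of the other dumps):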

 


    0:032> !dumpheap -type ServiceThrottle -short 
    01fb65a4 

    0:032> !do 01fb65a4 
    Name:  System.ServiceModel.Dispatcher.ServiceThrottle 
    MethodTable: 70cc56f0 
    EEClass:  70a07ce4 
    Size:  40(0x28) bytes 
    File:  C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.ServiceModel\v4.0_4.0.0.0__b77a5c561934e089\System.ServiceModel.dll 
    Fields: 
      MT Field Offset     Type VT  Attr Value Name 
    70cc572c 400314f  4 ...cher.FlowThrottle 0 instance 01fb6660 calls 
    70cc572c 4003150  8 ...cher.FlowThrottle 0 instance 01fb6760 sessions 
    71725030 4003151  c ...her.QuotaThrottle 0 instance 00000000 dynamic 
    70cc572c 4003152  10 ...cher.FlowThrottle 0 instance 02013110 instanceContexts 
    70cb653c 4003153  14 ...l.ServiceHostBase 0 instance 01fadd84 host 
    70cc7cf8 4003154  18 ...manceCountersBase 0 instance 02003590 servicePerformanceCounters 
    74246788 4003155  20  System.Boolean 1 instance  1 isActive 
    7423f744 4003156  1c  System.Object 0 instance 01fb65cc thisLock 
    74242ad4 400314d  1134   System.Int32 1 static  128 DefaultMaxConcurrentCallsCpuCount 
    74242ad4 400314e  1138   System.Int32 1 static  800 DefaultMaxConcurrentSessionsCpuCount 

    0:032> !do 01fb6660 
    Name:  System.ServiceModel.Dispatcher.FlowThrottle 
    MethodTable: 70cc572c 
    EEClass:  70a07d24 
    Size:  52(0x34) bytes 
    File:  C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.ServiceModel\v4.0_4.0.0.0__b77a5c561934e089\System.ServiceModel.dll 
    Fields: 
      MT Field Offset     Type VT  Attr Value Name 
    74242ad4 4002ede  20   System.Int32 1 instance  1024 capacity 
    74242ad4 4002edf  24   System.Int32 1 instance  0 count 
    74246788 4002ee0  2c  System.Boolean 1 instance  0 warningIssued 
    74242ad4 4002ee1  28   System.Int32 1 instance  89 warningRestoreLimit 
    7423f744 4002ee2  4  System.Object 0 instance 01fb6694 mutex 
    74232914 4002ee3  8 ...ding.WaitCallback 0 instance 01fb6640 release 
    00000000 4002ee4  c      0 instance 01fb66a0 waiters 
    7423fb08 4002ee5  10  System.String 0 instance 01fb65d8 propertyName 
    7423fb08 4002ee6  14  System.String 0 instance 01fb660c configName 
    74230f78 4002ee7  18  System.Action 0 instance 02007670 acquired 
    74230f78 4002ee8  1c  System.Action 0 instance 02007690 released 

    0:032> !do 01fb6760 
    Name:  System.ServiceModel.Dispatcher.FlowThrottle 
    MethodTable: 70cc572c 
    EEClass:  70a07d24 
    Size:  52(0x34) bytes 
    File:  C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.ServiceModel\v4.0_4.0.0.0__b77a5c561934e089\System.ServiceModel.dll 
    Fields: 
      MT Field Offset     Type VT  Attr Value Name 
    74242ad4 4002ede  20   System.Int32 1 instance  1024 capacity 
    74242ad4 4002edf  24   System.Int32 1 instance  0 count 
    74246788 4002ee0  2c  System.Boolean 1 instance  0 warningIssued 
    74242ad4 4002ee1  28   System.Int32 1 instance  560 warningRestoreLimit 
    7423f744 4002ee2  4  System.Object 0 instance 01fb6794 mutex 
    74232914 4002ee3  8 ...ding.WaitCallback 0 instance 01fb6740 release 
    00000000 4002ee4  c      0 instance 01fb67a0 waiters 
    7423fb08 4002ee5  10  System.String 0 instance 01fb66d0 propertyName 
    7423fb08 4002ee6  14  System.String 0 instance 01fb6708 configName 
    74230f78 4002ee7  18  System.Action 0 instance 020076b0 acquired 
    74230f78 4002ee8  1c  System.Action 0 instance 020076d0 released 

All of these values seem to be fine. We then checked the thread pool and the threads:



    0:032> !threadpool 
    CPU utilization: 0% 
    Worker Thread: Total: 1023 Running: 1017 Idle: 6 MaxLimit: 1023 MinLimit: 1000 
    Work Request in Queue: 0 
    -------------------------------------- 
    Number of Timers: 4 
    -------------------------------------- 
    Completion Port Thread:Total: 32 Free: 0 MaxFree: 16 CurrentLimit: 33 MaxLimit: 1000 MinLimit: 1000 

    0:032> !threads -special 
    ThreadCount:  1027 
    UnstartedThread: 997 
    BackgroundThread: 28 
    PendingThread: 997 
    DeadThread:  1 
    Hosted Runtime: no 
             PreEmptive GC Alloc    Lock 
      ID OSID ThreadOBJ State GC   Context  Domain Count APT Exception 
     0 1 1d80 006b6518  a020 Enabled 00000000:00000000 006ac0b0  0 MTA 
     2 2 238c 006c1840  b220 Enabled 00000000:00000000 006ac0b0  0 MTA (Finalizer) 
    XXXX 4  00704f58 1019820 Enabled 00000000:00000000 006ac0b0  0 Ukn (Threadpool Worker) 
     4 5 21a4 00706480 3009220 Enabled 0bedaf78:0bedb5c8 006ac0b0  0 MTA (Threadpool Worker) 
     5 6 e8c 03de9428 100a220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     7 7 634 03e0d318 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     8 8 1d38 03ebeb08 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     9 9 1808 03e4fd70 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     10 a 1c48 03e50d70 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     11 b 1d88 073be2b0 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     12 c 1c74 073bf2b8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     13 d 1ae4 073c0dc8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     14 e 1818 073c1598 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     15 f 1a58 073c1fa8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     16 10 13e0 073c4bb8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     17 11 1a3c 073c5388 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     18 12 1b5c 03e9ffe0 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     19 13 1b80 03ea04e8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     20 14 900 03ea09f0 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     21 15 1c84 03ea0ef8 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     22 16 da0 03ea1400 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     23 17 13b0 03ea1908 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     24 18 18cc 03ea1e10 3009220 Enabled 00000000:00000000 006ac0b0  0 MTA (Threadpool Worker) 
     25 1a 1008 03ea2820 3009220 Enabled 0bfe03b8:0bfe21c4 006ac0b0  0 MTA (Threadpool Worker) 
     27 29 bc4 0b041588 1009220 Enabled 0bdd7c3c:0bdd99f8 006ac0b0  0 MTA (Threadpool Worker) 
     28 3b ad8 0af7d740 1009220 Enabled 0bf8c0d8:0bf8c1c4 006ac0b0  0 MTA (Threadpool Worker) 
     29 64 dd8 0ae54890 1009220 Enabled 0bdfbb9c:0bdfd9f8 006ac0b0  0 MTA (Threadpool Worker) 
     30 2f 440 0b03c010 1009220 Enabled 0be03bf0:0be059f8 006ac0b0  0 MTA (Threadpool Worker) 
     31 25 2198 0b03f080 1009220 Enabled 0bd6d5b4:0bd6f410 006ac0b0  0 MTA (Threadpool Worker) 
     32 63 1b9c 0ae41388 1009220 Enabled 0bdb5b14:0bdb79f8 006ac0b0  0 MTA (Threadpool Worker) 
    XXXX 33 2270 0b09cd20  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 
    XXXX 5b 1554 0ae54388  1400 Enabled 00000000:00000000 006ac0b0  0 MTA 
    XXXX 31 1098 0ae53978  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 
    XXXX 34 15c 0af7be18  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 
    ... -> many more threads here 
    XXXX 403 24e4 0d85b578  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 
    XXXX 404 24d8 0d85ba80  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 
    XXXX 405 24e0 0d85bf88  1400 Enabled 00000000:00000000 006ac0b0  0 Ukn 

      OSID  Special thread type 
     1 a88 DbgHelper 
     2 238c Finalizer 
     4 21a4 ThreadpoolWorker 
     5 e8c Timer 
     7 634 ThreadpoolWorker 
     8 1d38 ThreadpoolWorker 
     9 1808 ThreadpoolWorker 
     10 1c48 ThreadpoolWorker 
     11 1d88 ThreadpoolWorker 
     12 1c74 ThreadpoolWorker 
     13 1ae4 ThreadpoolWorker 
     14 1818 ThreadpoolWorker 
     15 1a58 ThreadpoolWorker 
     16 13e0 ThreadpoolWorker 
     17 1a3c ThreadpoolWorker 
     18 1b5c ThreadpoolWorker 
     19 1b80 ThreadpoolWorker 
     20 900 ThreadpoolWorker 
     21 1c84 ThreadpoolWorker 
     22 da0 ThreadpoolWorker 
     23 13b0 ThreadpoolWorker 
     24 18cc ThreadpoolWorker 
     25 1008 ThreadpoolWorker 
     26 1b08 Gate 
     27 bc4 ThreadpoolWorker 
     28 ad8 ThreadpoolWorker 
     29 dd8 ThreadpoolWorker 
     30 440 ThreadpoolWorker 
     31 2198 ThreadpoolWorker 
     32 1b9c ThreadpoolWorker 

 

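We were shocked when we saw this many threads. We now suspect that our problem is related to the large number of threads. We would therefore appreciate it if anyone could answer the following questions: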

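  1. Is our assumption correct, or are we on the wrong track with this investigation?
  2. The !threadpool command reports a very large number of running threads (Running: 1017, and !threads reports a ThreadCount of 1027). Looking at the output of !threads, it seems that only 28 threads are actually working. How can this difference be explained?
  3. We have a very large number of unstarted/pending threads. What is the difference between unstarted threads and pending threads? We tried to reproduce this behavior, but even when setting the minimum and maximum number of threads on the thread pool we do not get numbers this high. What is more, a dump created after 2 hours of process uptime showed no unstarted or pending threads (the service was still working at that point). The process uptime of the original dump was about 14 days.
  4. What does a Free value of 0 for the completion port threads mean?
  5. Are there other WinDBG methods or commands that would help us understand our problem better?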
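
Alongside the dumps, thread pool usage can also be recorded from inside the running service, so that any growth over the roughly 14 days of uptime becomes visible before the hang. The following is a minimal sketch, assuming a hypothetical helper named ThreadPoolSnapshot; only the ThreadPool.Get* calls are framework APIs.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSnapshot
{
    // Returns a one-line summary of current thread pool usage.
    // "busy" is computed as max minus available, i.e. the number of
    // pool threads that are currently active.
    public static string Describe()
    {
        int minWorker, minIo, maxWorker, maxIo, freeWorker, freeIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);

        return string.Format(
            "worker: busy={0} min={1} max={2}; completion port: busy={3} min={4} max={5}",
            maxWorker - freeWorker, minWorker, maxWorker,
            maxIo - freeIo, minIo, maxIo);
    }
}
```

Writing the result of ThreadPoolSnapshot.Describe() to a log every minute or so would be enough. It is probably safer to do this from a dedicated thread than from a System.Threading.Timer, because timer callbacks run on the thread pool and may not fire once the pool is exhausted.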

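We are quite unhappy with the current state of our software. When searching for information on this topic, the most common explanation we found is sessions or concurrent calls that are never closed, but as shown above that does not seem to be our problem. Any kind of help is much appreciated!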


It looks like we have a similar problem: [link](http://stackoverflow.com/questions/36644902/big-number-of-unstarted-threads-in-net-application). Did you find the cause?

Answers


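We ran into a similar problem with a self-hosted WCF interface that provided a synchronous request/response web service on top of asynchronous back-end requests (2 one-way service calls). Early in our testing we noticed that after some time our service stopped responding to new requests. After some investigation we found that whenever the back-end service (which we had no control over) did not send a response, we kept waiting indefinitely and therefore kept our client connection open.

We solved the problem by introducing a "wait time" configuration value, so that we are certain to respond to the client and close the connection. We use something like the following...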

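    // Run the back-end call as a task so that the wait can be bounded.
    Task processTask = Task.Factory.StartNew(() => Process(message)); 

    // Wait returns false if the task has not finished within the configured time.
    bool isProcessSuccess = processTask.Wait(shared.ConfigReader.SyncWebServiceWaitTime); 

    if (!isProcessSuccess) 
    { 
        //handle error … 
    } 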
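
The pattern above can be made self-contained as follows. This is a minimal sketch, not the code we actually run: BoundedWait, TryRun and the CancellationToken parameter are assumptions added for illustration. Keep in mind that Task.Wait with a timeout only ends the waiting; the task keeps its thread pool thread until the work returns, which is why the sketch also signals cancellation so that cooperative work can stop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedWait
{
    // Runs 'work' as a task and waits at most 'timeout' for it.
    // Returns true and the result if the work finished in time, otherwise
    // requests cancellation and returns false.
    // If the work throws before the timeout, Task.Wait rethrows the error
    // wrapped in an AggregateException, which is left to the caller.
    public static bool TryRun<T>(Func<CancellationToken, T> work, TimeSpan timeout, out T result)
    {
        var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task<T> task = Task.Factory.StartNew<T>(() => work(token));

        if (task.Wait(timeout))
        {
            result = task.Result;
            return true;
        }

        // Timed out: ask the work to stop. Work that ignores the token
        // will keep running and keep occupying a pool thread.
        cts.Cancel();
        result = default(T);
        return false;
    }
}
```

A caller would pass the back-end call as the delegate and the configured wait time as the timeout, then send an error response to the client whenever TryRun returns false.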

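The following link, which covers the performance counters available for WCF services, can also help determine whether calls are being closed as expected: http://blogs.microsoft.co.il/blogs/idof/archive/2011/08/11/wcf-scaling-check-your-counters.aspx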