2009-08-22 234 views
1

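Debugging unexpected termination

I am writing a daemon in C on Linux. It catches the signals SIGHUP, SIGTERM, SIGINT and SIGQUIT, logs them using syslog, and exits. If it receives SIGSEGV, it dumps core. When any of these happen, everything works as expected, but occasionally the daemon just exits: no clean shutdown, no logged signal, and no core dump. I am stumped and don't know how to debug the problem. Apart from these signals, what else could make it exit? Is there an obvious answer I am missing? How would you suggest debugging other seemingly sporadic problems in a daemon?

Answers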

3

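If your daemon is using network sockets, it could well be SIGPIPE: you get this signal if you try to write to a socket (or pipe) that has been closed by the other end. Note that even if you check that the socket is writable before writing (for example, with select()), it can always be closed between the check and the write.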

+0

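Ah! I am using sockets and am not trapping SIGPIPE. I hadn't thought of that, and I bet that's it. Currently my select() loop breaks if the call is interrupted, but I would like to stay in the loop if the cause is a SIGPIPE. From your comment I gather that a select() call will not be interrupted by SIGPIPE, only read()/write() calls. Is that true? – user19745 2009-08-23 12:13:42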

+1

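Your process will not be signalled with 'SIGPIPE' from a 'select()', but 'select()' will return with the file descriptor marked as readable (so that you can discover that it has been closed). 'SIGPIPE' can only be raised by 'write()'. If you ignore or handle 'SIGPIPE', 'write()' will fail with 'errno' set to 'EPIPE'. – caf 2009-08-23 12:41:01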

2

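You can have the daemon's parent stay around and wait for it, then have the parent log why the daemon exited (i.e. whether it was killed by a signal or exited on its own).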

+0

That's a good solution. How can the parent get this information once the child has died? – user19745 2009-08-22 21:59:23

+0

In the parent, call wait, use WIFEXITED/WIFSIGNALED to determine what happened, then call syslog with an appropriate log message. Check the man page for wait. – 2009-08-23 08:11:59

1

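Well, there are plenty of other signals that will cause it to exit, including of course SIGKILL, which you can't do anything about. In the listing below, taken mostly from man 7 signal, look for anything whose Action is Term or Core (although the latter should at least leave a core dump):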

    First the signals described in the original POSIX.1-1990 standard.

    Signal     Value     Action   Comment
    -------------------------------------------------------------------------
    SIGHUP        1       Term    Hangup detected on controlling terminal
                                  or death of controlling process
    SIGINT        2       Term    Interrupt from keyboard
    SIGQUIT       3       Core    Quit from keyboard
    SIGILL        4       Core    Illegal Instruction
    SIGABRT       6       Core    Abort signal from abort(3)
    SIGFPE        8       Core    Floating point exception
    SIGKILL       9       Term    Kill signal
    SIGSEGV      11       Core    Invalid memory reference
    SIGPIPE      13       Term    Broken pipe: write to pipe with no readers
    SIGALRM      14       Term    Timer signal from alarm(2)
    SIGTERM      15       Term    Termination signal
    SIGUSR1   30,10,16    Term    User-defined signal 1
    SIGUSR2   31,12,17    Term    User-defined signal 2
    SIGCHLD   20,17,18    Ign     Child stopped or terminated
    SIGCONT   19,18,25    Cont    Continue if stopped
    SIGSTOP   17,19,23    Stop    Stop process
    SIGTSTP   18,20,24    Stop    Stop typed at tty
    SIGTTIN   21,21,26    Stop    tty input for background process
    SIGTTOU   22,22,27    Stop    tty output for background process

    The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored. 

    Next the signals not in the POSIX.1-1990 standard but described in SUSv2 and POSIX.1-2001. 

    Signal       Value     Action   Comment
    -------------------------------------------------------------------------
    SIGBUS      10,7,10     Core    Bus error (bad memory access)
    SIGPOLL                 Term    Pollable event (Sys V). Synonym of SIGIO
    SIGPROF     27,27,29    Term    Profiling timer expired
    SIGSYS      12,-,12     Core    Bad argument to routine (SVr4)
    SIGTRAP        5        Core    Trace/breakpoint trap
    SIGURG      16,23,21    Ign     Urgent condition on socket (4.2BSD)
    SIGVTALRM   26,26,28    Term    Virtual alarm clock (4.2BSD)
    SIGXCPU     24,24,30    Core    CPU time limit exceeded (4.2BSD)
    SIGXFSZ     25,25,31    Core    File size limit exceeded (4.2BSD)

    Up to and including Linux 2.2, the default behaviour for SIGSYS, SIGXCPU, SIGXFSZ, and (on architectures other than SPARC
    and MIPS) SIGBUS was to terminate the process (without a core dump). (On some other Unices the default action for SIGXCPU
    and SIGXFSZ is to terminate the process without a core dump.) Linux 2.4 conforms to the POSIX.1-2001 requirements
    for these signals, terminating the process with a core dump.

    Next various other signals. 

    Signal       Value     Action   Comment
    --------------------------------------------------------------------
    SIGIOT         6        Core    IOT trap. A synonym for SIGABRT
    SIGEMT       7,-,7      Term
    SIGSTKFLT    -,16,-     Term    Stack fault on coprocessor (unused)
    SIGIO       23,29,22    Term    I/O now possible (4.2BSD)
    SIGCLD       -,-,18     Ign     A synonym for SIGCHLD
    SIGPWR      29,30,19    Term    Power failure (System V)
    SIGINFO      29,-,-             A synonym for SIGPWR
    SIGLOST      -,-,-      Term    File lock lost
    SIGWINCH    28,28,20    Ign     Window resize signal (4.3BSD, Sun)
    SIGUNUSED    -,31,-     Term    Unused signal (will be SIGSYS)
2

Attach GDB to it:

gdb -p <pid>
Make sure you compile with the -g flag, and get a backtrace as soon as it exits. Good luck!

+0

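I didn't know you could do that! This is great, because the daemon runs on a server I have no physical access to. I work from my laptop and move around regularly, so I can't keep a terminal open to monitor the daemon while I am on the move. This way I can attach and detach gdb without shutting the daemon down. Excellent! – user19745 2009-08-22 22:09:49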

1

A shell wrapper can catch your daemon's exit status. Here is how it works:

$ ./waitstatus true 
pid 1512: exit status 0 (success) 

$ ./waitstatus false 
pid 1514: exit status 1 (abnormal) 

$ ./waitstatus perl -e 'exit 21' 
pid 1518: exit status 21 (abnormal) 

$ ./waitstatus perl -e 'kill TERM => $$' 
pid 1520: terminated on signal 15 

$ ./waitstatus no-such-command 
pid 1522: command not found: no-such-command 

$ ./waitstatus /sbin/EACCES.contrived 
pid 1524: command not executable: /sbin/EACCES.contrived 

...and here is how it is implemented:

$ cat ./waitstatus 
#! /bin/bash 

"$@" & 
PID=$! 

wait $PID 
STATUS=$? 

if [ $STATUS -gt 128 ]; then 
    MSG="terminated on signal $(($STATUS - 128))"; 
else 
    case $STATUS in 
    0) 
     MSG="exit status 0 (success)" 
     ;; 
    127) 
     MSG="command not found: $1" 
     ;; 
    126) 
     MSG="command not executable: $1" 
     ;; 
    *) 
     MSG="exit status $STATUS (abnormal)" 
     ;; 
    esac 
fi 

echo "pid $PID: $MSG" 
exit $STATUS 

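You may want to change the final echo line to an invocation of your system's logger command, for example, to send the status message to syslog.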