2012-01-09

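Wrong cstime in the /proc/pid/stat file

The stime or cstime value in /proc/pid/stat is sometimes so large that it makes no sense. Only some processes show this, and only occasionally. For example: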

# ps -eo pid,ppid,stime,etime,time,%cpu,%mem,command |grep nsc 
4815  1 Jan08 1-01:20:02 213503-23:34:33 20226149 0.1 /usr/sbin/nscd 
# 
# cat /proc/4815/stat 
4815 (nscd) S 1 4815 4815 0 -1 4202560 2904 0 0 0 21 1844674407359 0 0 20 0 9 0 4021 241668096 326 18446744073709551615 139782748139520 139782748261700 140737353849984 140737353844496 139782734487251 0 0 3674112 16390 18446744073709551615 0 0 17 1 0 0 0 0 0 

As you can see, the stime of process 4815 (nscd) is 1844674407359 clock ticks, which corresponds to 213503-23:34:33, yet the process has only been running for 1-01:20:02.
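The conversion from clock ticks to the days-hh:mm:ss format that ps prints can be checked with a short Python sketch. It assumes 100 ticks per second, which matches the decoded values in this thread (21 clocks = 0.21 s) but should be confirmed with `getconf CLK_TCK` on the machine in question. The helper name `ticks_to_etime` is made up for illustration.

```python
# Assumption: 100 clock ticks per second (USER_HZ); verify with `getconf CLK_TCK`.
USER_HZ = 100


def ticks_to_etime(ticks: int, hz: int = USER_HZ) -> str:
    """Format clock ticks as days-hh:mm:ss, dropping fractional seconds."""
    seconds = ticks // hz
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{secs:02d}"


# stime of nscd (PID 4815) as reported in /proc/4815/stat
print(ticks_to_etime(1844674407359))  # 213503-23:34:33
```

This reproduces the TIME column from ps, so the huge figure comes straight from the stat file rather than from ps itself.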

Another set of processes where the cstime is wrong is the following:

A bash process forks sh, which in turn forks sleep.

8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 3 0 4 1844674407368 20 0 1 0 1738175 13258752 451 18446744073709551615 4194304 4757932 140736528897536 140736528896544 47722675403157 0 65536 4100 65538 18446744071562341410 0 0 17 5 0 0 0 0 0 

8184 (sh) S 8155 8155 8155 0 -1 4202496 475 0 0 0 0 0 0 0 20 0 1 0 1738185 11698176 357 18446744073709551615 4194304 4757932 140733266239472 140733266238480 47964680542613 0 65536 4100 65538 18446744071562341410 0 0 17 6 0 0 0 0 0 

8185 (sleep) S 8184 8155 8155 0 -1 4202496 261 0 0 0 0 0 0 0 20 0 1 0 1738186 8577024 177 18446744073709551615 4194304 4212204 140734101195248 140734101194776 48002231427168 0 0 4096 0 0 0 0 17 7 0 0 0 0 0 

So you can see that the cstime of the bash process is 1844674407368, which is far larger than the total CPU time of its children.
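To avoid miscounting columns by eye, the four CPU-time fields can be pulled out of a stat line programmatically. The following Python sketch assumes the field order documented in proc(5) (utime, stime, cutime and cstime are fields 14 to 17) and splits after the last `)` because the command name may itself contain spaces. The function name and the shortened sample line are for illustration only.

```python
def parse_cpu_times(stat_line: str) -> dict:
    """Return utime, stime, cutime and cstime (in clock ticks) from a stat line."""
    # The comm field is wrapped in parentheses and may contain spaces or ')',
    # so everything after the LAST ')' is split on whitespace.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so field 14 (utime) sits at index 11.
    names = ("utime", "stime", "cutime", "cstime")
    return {name: int(rest[11 + i]) for i, name in enumerate(names)}


# First 17 fields of the bash line above (the rest is cut off for brevity).
BASH_STAT = ("8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 "
             "3 0 4 1844674407368")

print(parse_cpu_times(BASH_STAT))
```

For the bash line this yields utime 3, stime 0, cutime 4 and cstime 1844674407368, confirming that the outlier in this case really is cstime.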

My server has an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, with 4 cores and 8 threads. The operating system is SUSE Linux Enterprise Server 11 SP1 (x86_64), as shown below:

# lsb_release -a 
LSB Version: core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch 
Distributor ID: SUSE LINUX 
Description: SUSE Linux Enterprise Server 11 (x86_64) 
Release: 11 
Codename: n/a 
# 
# uname -a 
Linux node2 2.6.32.12-0.7-xen #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux 

So is this a kernel problem? Can anyone help me resolve it?

Answer

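I suspect you are simply seeing a kernel bug. Update to the latest update kernel offered for SLES (something like 2.6.32.42) and see whether the problem persists. By the way, for nscd it is stime, not cstime, that is unusually high. In fact, looking closely, you will notice that the value looks like a string-truncated 18446744073709551615 (2^64-1), plus or minus a few clock ticks. Decoding the stat line of PID 4815 field by field: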

pid_nr: 4815 
tcomm: (nscd) 
state: S 
ppid: 1 
pgid: 4815 
sid: 4815 
tty_nr: 0 
tty_pgrp: -1 
task_flags: 4202560/0x402040 
min_flt: 2904 
cmin_flt: 0 
max_flt: 0 
cmax_flt: 0 
utime: 21 clocks (= 21 clocks) (= 0.210000 s) 
stime: 1844674407359 clocks (= 1844674407359 clocks) (= 18446744073.590000 s) 
cutime: 0 clocks (= 0 clocks) (= 0.000000 s) 
cstime: 0 clocks (= 0 clocks) (= 0.000000 s) 
priority: 20 
nice: 0 
num_threads: 9 
always-zero: 0 
start_time: 4021 
vsize: 241668096 
get_mm_rss: 326 
rsslim: 18446744073709551615/0xffffffffffffffff 
mm_start_code: 139782748139520/0x7f21b50c7000 
mm_end_code: 139782748261700/0x7f21b50e4d44 
mm_start_stack: 140737353849984/0x7ffff7fb9c80 
esp: 140737353844496/0x7ffff7fb8710 
eip: 139782734487251/0x7f21b43c1ed3 
obsolete-pending-signals: 0/0x0 
obsolete-blocked-signals: 0/0x0 
obsolete-sigign: 3674112/0x381000 
obsolete-sigcatch: 16390/0x4006 
wchan: 18446744073709551615/0xffffffffffffffff 
always-zero: 0 
always-zero: 0 
task_exit_signal: 17 
task_cpu: 1 
task_rt_priority: 0 
task_policy: 0 
delayacct_blkio_ticks: 0 
gtime: 0 clocks (= 0 clocks) (= 0.000000 s) 
cgtime: 0 clocks (= 0 clocks) (= 0.000000 s) 
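
The "string-truncated 2^64-1" observation can be checked with a few lines of Python. This is only an arithmetic sketch of the coincidence noted in the answer; it does not show what the kernel actually does, and the helper name is hypothetical.

```python
U64_MAX = 2**64 - 1  # 18446744073709551615


def offset_from_truncated_max(value: int) -> int:
    """Difference between value and the leading digits of 2^64-1.

    The decimal string of 2^64-1 is cut to as many digits as value has,
    then the two numbers are subtracted.
    """
    prefix = int(str(U64_MAX)[:len(str(value))])
    return value - prefix


print(str(U64_MAX)[:13])                         # 1844674407370
print(offset_from_truncated_max(1844674407359))  # -11 (stime of nscd)
print(offset_from_truncated_max(1844674407368))  # -2  (cstime of bash)
```

Both suspicious values sit only a few ticks below the first 13 digits of 2^64-1, which is consistent with the answer's reading that they derive from an unsigned 64-bit value close to its maximum rather than from real CPU usage.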

Thanks @jørgensen. Yes, the 1844674407359 shown above for process 4815 is stime, but I have also noticed abnormally high cstime values. – ShemLiang 2012-01-10 01:44:44