2013-05-20

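Inconsistent lock state

I am working on the Linux kernel. I modified the scheduling sub-module of Linux kernel 3.3 and tried to boot the kernel on a Beagleboard. I got an "inconsistent lock state" error. Can anyone help me analyze the following debug output? Thanks!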

[ 0.163452] ================================= 
[ 0.167999] [ INFO: inconsistent lock state ] 
[ 0.172576] 3.3.0-rc7-00008-g8bd3d32-dirty #27 Not tainted 
[ 0.178314] --------------------------------- 
[ 0.182891] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. 
[ 0.189178] swapper/0/0 [HC0[0]:SC0[0]:HE1:SE1] takes: 
[ 0.194549] (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4 
[ 0.202117] {IN-HARDIRQ-W} state was registered at: 
[ 0.207214] [<c008fa7c>] __lock_acquire+0xc4c/0x1e28 
[ 0.212615] [<c00912dc>] lock_acquire+0x98/0x100 
[ 0.217651] [<c04761bc>] _raw_spin_lock+0x2c/0x3c 
[ 0.222778] [<c0072890>] scheduler_tick+0x34/0x134 
[ 0.227996] [<c0052738>] update_process_times+0x58/0x68 
[ 0.233642] [<c00881ac>] tick_periodic+0x48/0xc4 
[ 0.238677] [<c00882c0>] tick_handle_periodic+0x24/0x98 
[ 0.244323] [<c00264a0>] omap2_gp_timer_interrupt+0x24/0x34 
[ 0.250366] [<c00a1344>] handle_irq_event_percpu+0x5c/0x22c 
[ 0.256378] [<c00a1550>] handle_irq_event+0x3c/0x5c 
[ 0.261657] [<c00a39f8>] handle_level_irq+0xac/0x138 
[ 0.267059] [<c00a0b58>] generic_handle_irq+0x30/0x48 
[ 0.272521] [<c0014d18>] handle_IRQ+0x4c/0xac 
[ 0.277313] [<c000872c>] omap3_intc_handle_irq+0x44/0x4c 
[ 0.283050] [<c0476a64>] __irq_svc+0x44/0x60 
[ 0.287719] [<c06286e4>] start_kernel+0x204/0x354 
[ 0.292846] [<80008044>] 0x80008044 
[ 0.296691] irq event stamp: 2802 
[ 0.300201] hardirqs last enabled at (2801): [<c0476348>] _raw_write_unlock_irq+0x24/0x2c 
[ 0.308807] hardirqs last disabled at (2802): [<c0476294>] _raw_spin_lock_irqsave+0x1c/0x58 
[ 0.317504] softirqs last enabled at (2756): [<c004a6dc>] irq_exit+0x94/0x9c 
[ 0.324951] softirqs last disabled at (2751): [<c004a6dc>] irq_exit+0x94/0x9c 
[ 0.332397] 
[ 0.332397] other info that might help us debug this: 
[ 0.339294] Possible unsafe locking scenario: 
[ 0.339294] 
[ 0.345581]  CPU0 
[ 0.348175]  ---- 
[ 0.350769] lock(&rq->lock); 
[ 0.354003] <Interrupt> 
[ 0.356781]  lock(&rq->lock); 
[ 0.360198] 
[ 0.360198] *** DEADLOCK *** 
[ 0.360229] 
[ 0.366577] 2 locks held by swapper/0/0: 
[ 0.370697] #0: (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4 
[ 0.378875] #1: (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4 
[ 0.386871] 
[ 0.386871] stack backtrace: 
[ 0.391571] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c008e6fc>] (print_usage_bug+0x1d8/0x) 
[ 0.401062] [<c008e6fc>] (print_usage_bug+0x1d8/0x2c0) from [<c008ebac>] (mark_lock+0x3c8/0x64c) 
[ 0.410217] [<c008ebac>] (mark_lock+0x3c8/0x64c) from [<c0091c24>] (mark_held_locks+0xb0/0x144) 
[ 0.419250] [<c0091c24>] (mark_held_locks+0xb0/0x144) from [<c0091d60>] (trace_hardirqs_on_calle) 
[ 0.429565] [<c0091d60>] (trace_hardirqs_on_caller+0xa8/0x19c) from [<c000f284>] (do_vfp+0x8/0x2) 
[ 12.759429] BUG: spinlock lockup on CPU#0, swapper/0/0 
[ 12.764801] lock: c0de0380, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0 
[ 12.772613] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c0261600>] (do_raw_spin_lock+0xa0/0x) 
[ 12.782135] [<c0261600>] (do_raw_spin_lock+0xa0/0x134) from [<c0072890>] (scheduler_tick+0x34/0x) 
[ 12.791625] [<c0072890>] (scheduler_tick+0x34/0x134) from [<c0052738>] (update_process_times+0x5) 
[ 12.801391] [<c0052738>] (update_process_times+0x58/0x68) from [<c00881ac>] (tick_periodic+0x48/) 
[ 12.811004] [<c00881ac>] (tick_periodic+0x48/0xc4) from [<c00882c0>] (tick_handle_periodic+0x24/) 
[ 12.820587] [<c00882c0>] (tick_handle_periodic+0x24/0x98) from [<c00264a0>] (omap2_gp_timer_inte) 
[ 12.831176] [<c00264a0>] (omap2_gp_timer_interrupt+0x24/0x34) from [<c00a1344>] (handle_irq_even) 
[ 12.842102] [<c00a1344>] (handle_irq_event_percpu+0x5c/0x22c) from [<c00a1550>] (handle_irq_even) 
[ 12.852325] [<c00a1550>] (handle_irq_event+0x3c/0x5c) from [<c00a39f8>] (handle_level_irq+0xac/0) 
[ 12.861907] [<c00a39f8>] (handle_level_irq+0xac/0x138) from [<c00a0b58>] (generic_handle_irq+0x3) 
[ 12.871704] [<c00a0b58>] (generic_handle_irq+0x30/0x48) from [<c0014d18>] (handle_IRQ+0x4c/0xac) 
[ 12.880828] [<c0014d18>] (handle_IRQ+0x4c/0xac) from [<c000872c>] (omap3_intc_handle_irq+0x44/0x) 
[ 12.890258] [<c000872c>] (omap3_intc_handle_irq+0x44/0x4c) from [<c0476a64>] (__irq_svc+0x44/0x6) 
[ 12.899566] Exception stack(0xc0677d98 to 0xc0677de0) 
[ 12.904846] 7d80:              edd47a1a 00000000 
[ 12.913360] 7da0: c0076e30 c0692630 c0693ad8 600001d3 c0676050 00000001 00000a00 c0476acc 
[ 12.921874] 7dc0: c0676000 00000468 c0704b70 c0677de0 c0476ac4 c000f290 60000153 ffffffff 
[ 12.930419] [<c0476a64>] (__irq_svc+0x44/0x60) from [<c000f290>] (do_vfp+0x14/0x20) 
[ 12.938385] [<c000f290>] (do_vfp+0x14/0x20) from [<c0476ac4>] (__und_svc+0x44/0x80) 
[ 12.946380] [<c0476ac4>] (__und_svc+0x44/0x80) from [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8) 
[ 12.955505] [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8) from [<c00702c0>] (enqueue_task+0x64/0x) 
[ 12.964935] [<c00702c0>] (enqueue_task+0x64/0x74) from [<c00739e0>] (wake_up_new_task+0xdc/0x1d4) 
[ 12.974182] [<c00739e0>] (wake_up_new_task+0xdc/0x1d4) from [<c0042120>] (do_fork+0xe4/0x328) 
[ 12.983062] [<c0042120>] (do_fork+0xe4/0x328) from [<c0015068>] (kernel_thread+0x6c/0x7c) 
[ 12.991577] [<c0015068>] (kernel_thread+0x6c/0x7c) from [<c0464af4>] (rest_init+0x1c/0xd0) 
[ 13.000183] [<c0464af4>] (rest_init+0x1c/0xd0) from [<c06287c4>] (start_kernel+0x2e4/0x354) 
[ 13.008880] [<c06287c4>] (start_kernel+0x2e4/0x354) from [<80008044>] (0x80008044) 

What specific changes did you make to the scheduler module (apparently something involving the spinlock in `scheduler_tick`)?

Answer


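First of all, this is a lockdep warning. It indicates that your kernel may have an inconsistent locking problem, which could lead to a deadlock. lockdep groups locks of the same kind into logical classes and runs a set of rule checks on them. If any rule is violated, it emits a warning. lockdep basically has two kinds of rules: a) single-lock state rules and b) multi-lock dependency rules. From the warning log you provided: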

[ 0.370697] #0: (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4 
[ 0.378875] #1: (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4 

lockdep places both p->pi_lock and rq->lock logically in the same class. The first lock is taken in try_to_wake_up() and the second in __task_rq_lock(). So the whole thing becomes:

try_to_wake_up() -> p->pi_lock is held 
     ttwu_remote() 
      __task_rq_lock() -> rq->lock is held 

So two locks of the same class are taken on a single path. lockdep warns about this as part of its multi-lock dependency rules.
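To make the multi-lock rule concrete, here is a minimal user-space sketch in Python. It is a toy model, not lockdep's implementation: the `HeldLocks` class, its methods, and the class labels `"sched"`, `"pi_lock"` and `"rq_lock"` are all hypothetical names chosen for illustration. It flags two situations: taking a lock whose class is already held on the same path, and taking two classes in the opposite order to one seen earlier.

```python
class HeldLocks:
    """Toy tracker of the locks held on one execution path (hypothetical)."""

    def __init__(self):
        self.held = []      # (lock_class, lock_name) in acquisition order
        self.order = set()  # (earlier_class, later_class) pairs seen so far

    def acquire(self, lock_class, name):
        """Record an acquisition and return a list of warning strings."""
        warnings = []
        for held_class, held_name in self.held:
            if held_class == lock_class:
                # Two locks of one class nested on a single path.
                warnings.append(f"same-class nesting: {held_name} -> {name}")
            elif (lock_class, held_class) in self.order:
                # The opposite order was recorded earlier.
                warnings.append(f"order inversion: {held_name} -> {name}")
            self.order.add((held_class, lock_class))
        self.held.append((lock_class, name))
        return warnings

    def release(self, name):
        """Drop a lock from the held list; the recorded order is kept."""
        self.held = [(c, n) for c, n in self.held if n != name]


# If both locks are assumed to share one class, the nesting is flagged.
same = HeldLocks()
same.acquire("sched", "p->pi_lock")
print(same.acquire("sched", "rq->lock"))

# If they are assumed to be separate classes, the same path is quiet.
separate = HeldLocks()
separate.acquire("pi_lock", "p->pi_lock")
print(separate.acquire("rq_lock", "rq->lock"))
```

Whether a warning appears in this model depends entirely on how the two locks are assigned to classes, which is why the class assignment matters when reading the log.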

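Note: I am not sure what changes you made, so I cannot tell whether the warning is a false positive. All I have tried to do is explain the problem.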
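For comparison, the header of the posted log reads `inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage`, and its "Possible unsafe locking scenario" shows the same lock taken again from an interrupt on one CPU. That looks like the single-lock state kind of rule. A minimal sketch of such a rule, again as a Python toy model with hypothetical names (`LockClassState`, `taken_in_hardirq`, `held_with_hardirqs_on`), not lockdep's actual code:

```python
class LockClassState:
    """Toy usage record for one lock class (hypothetical, not lockdep's)."""

    def __init__(self, name):
        self.name = name
        self.in_hardirq = False   # lock was taken inside a hard-IRQ handler
        self.hardirq_on = False   # lock was held while hard IRQs were enabled

    def _check(self):
        # The two usages together mean an interrupt could arrive while the
        # lock is held and then spin on the very same lock.
        if self.in_hardirq and self.hardirq_on:
            return f"inconsistent lock state: {self.name}"
        return None

    def taken_in_hardirq(self):
        self.in_hardirq = True
        return self._check()

    def held_with_hardirqs_on(self):
        self.hardirq_on = True
        return self._check()


rq_lock = LockClassState("&rq->lock")
first = rq_lock.taken_in_hardirq()        # e.g. the timer tick path in the log
second = rq_lock.held_with_hardirqs_on()  # IRQs enabled while the lock is held
print(first, second)
```

In the log, the first usage corresponds to the `scheduler_tick` path registered under `{IN-HARDIRQ-W}`, and the second to `mark_held_locks` being reached from `trace_hardirqs_on_caller` while `wake_up_new_task` still holds `&rq->lock`.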


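Thank you for the analysis. I have enabled floating-point use in kernel space, which may be one cause of the deadlock, as described at http://stackoverflow.com/questions/13886338/use-of-floating-point-in-the-linux-kernel. I guess the deadlock is one of the so-called traps the kernel can fall into. – coder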


Maybe. It would help if you made clearer what you modified. I was only trying to give an overview of the warning.