[Lustre-discuss] [wc-discuss] System Deadlock

Andreas Dilger adilger at whamcloud.com
Wed Aug 17 21:10:01 PDT 2011


On 2011-08-17, at 2:23 PM, Roger Spellman wrote:
> I am in the process of porting Lustre client 1.8.4 to a recent kernel, 2.6.38.8.

That is a somewhat unfortunate starting point, since 1.8.6 clients at least work with 2.6.32 kernels.

>  This has been a challenge for a variety of reasons, such as the dcache_lock being removed from the kernel.

Yes, the dcache change is a major obstacle to newer kernels.
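
As a hedged sketch of what that change looks like in practice (walk_children() is a hypothetical helper, not actual Lustre code; only the locking pattern is the point): before 2.6.38 a walk of a dentry's children was serialized by the global dcache_lock, while on 2.6.38 the per-dentry d_lock must be taken instead, using the nested lock class for the children:

/*
 * Hedged sketch, not actual Lustre code: the general shape of the
 * dcache_lock removal in 2.6.38.
 */
#include <linux/version.h>
#include <linux/dcache.h>
#include <linux/list.h>
#include <linux/spinlock.h>

#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 38)
/* Pre-2.6.38: a single global lock protects all dcache traversals. */
static void walk_children(struct dentry *parent)
{
        struct dentry *child;

        spin_lock(&dcache_lock);
        list_for_each_entry(child, &parent->d_subdirs, d_u.d_child) {
                /* inspect child under the global lock */
        }
        spin_unlock(&dcache_lock);
}
#else
/* 2.6.38: lock the parent, then each child with the nested class. */
static void walk_children(struct dentry *parent)
{
        struct dentry *child;

        spin_lock(&parent->d_lock);
        list_for_each_entry(child, &parent->d_subdirs, d_u.d_child) {
                spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
                /* inspect child under its own lock */
                spin_unlock(&child->d_lock);
        }
        spin_unlock(&parent->d_lock);
}
#endif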

>  I am pretty close to having it working, but I can still generate a system lockup.  So far, I have only tested Lustre over TCP over Ethernet.

It's difficult to make any kind of assessment without knowing what changes you have made to the client.  It would be useful if you submitted your changes as a series of patches so that we can take a look at them.

> Using magic-sysrq, I was able to get a call trace, but I am not sure I fully understand it.  Can someone please validate my analysis?
>  
> Here is one of the running threads:
>  
> rm              R  running task        0  2039   2030 0x00000088
>  ffff88011cfbb578 ffffffffa059c8df 0000000000000000 0000000000000000
>  ffff88011caf8880 000000000001ce7d 00000000000000c1 0000000000000000
>  ffff88011cfbb548 ffffffffa06bfe34 0000000000000000 ffff88011ce9c800
> Call Trace:
>  [<ffffffffa059c8df>] ? LNetMDUnlink+0x6f/0x110 [lnet]
>  [<ffffffffa06bfe34>] ? lustre_msg_get_slv+0x94/0x100 [ptlrpc]
>  [<ffffffff8104ce53>] ? __wake_up+0x53/0x70
>  [<ffffffffa0698c29>] ? ldlm_completion_ast+0x349/0x8d0 [ptlrpc]
>  [<ffffffffa067af48>] ? ldlm_lock_enqueue+0x228/0xbb0 [ptlrpc]
>  [<ffffffffa0675148>] ? lock_res_and_lock+0x58/0xe0 [ptlrpc]
>  [<ffffffffa067b9fd>] ? ldlm_lock_change_resource+0x12d/0x3f0 [ptlrpc]
>  [<ffffffffa067e6f9>] ? ldlm_resource_get+0xe9/0xc00 [ptlrpc]
>  [<ffffffffa067e1c3>] ? ldlm_resource_putref+0x73/0x430 [ptlrpc]
>  [<ffffffffa067c7b3>] ? ldlm_lock_match+0x273/0x8f0 [ptlrpc]
>  [<ffffffff8116b342>] ? find_inode+0x62/0xb0
>  [<ffffffffa097dea2>] ? ll_update_inode+0x3a2/0x1140 [lustre]
>  [<ffffffffa099d150>] ? fid_test_inode+0x0/0x80 [lustre]
>  [<ffffffff8116c766>] ? ifind+0x66/0xc0
>  [<ffffffffa099d150>] ? fid_test_inode+0x0/0x80 [lustre]
>  [<ffffffffa06764d1>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
>  [<ffffffffa0676846>] ? ldlm_lock_add_to_lru+0x46/0x110 [ptlrpc]
>  [<ffffffffa067e6f9>] ? ldlm_resource_get+0xe9/0xc00 [ptlrpc]
>  [<ffffffffa067664d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
>  [<ffffffffa0676951>] ? ldlm_lock_remove_from_lru+0x41/0x110 [ptlrpc]
>  [<ffffffffa067e1c3>] ? ldlm_resource_putref+0x73/0x430 [ptlrpc]
>  [<ffffffffa0676a41>] ? ldlm_lock_addref_internal_nolock+0x21/0xa0 [ptlrpc]
>  [<ffffffffa0677866>] ? search_queue+0xc6/0x170 [ptlrpc]
>  [<ffffffffa067c62a>] ? ldlm_lock_match+0xea/0x8f0 [ptlrpc]
>  [<ffffffffa043369e>] ? cfs_free+0xe/0x10 [libcfs]
>  [<ffffffffa06aedad>] ? __ptlrpc_req_finished+0x59d/0xb30 [ptlrpc]
>  [<ffffffffa09579f0>] ? ll_lookup_finish_locks+0x80/0x140 [lustre]
>  [<ffffffffa06764d1>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
>  [<ffffffffa0676846>] ? ldlm_lock_add_to_lru+0x46/0x110 [ptlrpc]
>  [<ffffffffa067bfb0>] ? ldlm_lock_decref_internal+0x2f0/0x880 [ptlrpc]
>  [<ffffffffa0678b8f>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
>  [<ffffffffa0678b8f>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
>  [<ffffffffa067cfa1>] ? ldlm_lock_decref+0x41/0xb0 [ptlrpc]
>  [<ffffffffa08e92f4>] ? mdc_set_lock_data+0xd4/0x270 [mdc]
>  [<ffffffffa0957950>] ? ll_intent_drop_lock+0xa0/0xc0 [lustre]
>  [<ffffffff8116966a>] ? d_kill+0xaa/0x110
>  [<ffffffffa09579f0>] ? ll_lookup_finish_locks+0x80/0x140 [lustre]
>  [<ffffffffa099d7ee>] ? ll_prepare_mdc_op_data+0xbe/0x120 [lustre]
>  [<ffffffffa04390c3>] ? ts_kernel_list_record_file_line+0x123/0x3c0 [libcfs]
>  [<ffffffffa0963f41>] ? __ll_inode_revalidate_it+0x191/0x6e0 [lustre]
>  [<ffffffffa099d990>] ? ll_mdc_blocking_ast+0x0/0x890 [lustre]
>  [<ffffffff8116a03b>] ? dput+0x9b/0x190
>  [<ffffffff8108854f>] ? up+0x2f/0x50
>  [<ffffffff814abb8e>] ? common_interrupt+0xe/0x13
>  [<ffffffffa099c9bb>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
>  [<ffffffff8116fb1f>] ? mntput_no_expire+0x4f/0x1c0
>  [<ffffffff8116fcad>] ? mntput+0x1d/0x30
>  [<ffffffff8115d8b2>] ? path_put+0x22/0x30
>  [<ffffffff81157b23>] ? vfs_fstatat+0x73/0x80
>  [<ffffffff81157b54>] ? sys_newfstatat+0x24/0x50
>  [<ffffffff8100bf82>] ? system_call_fastpath+0x16/0x1b
>  
> Am I to understand that mntput_no_expire was running, and then an interrupt occurred (i.e. common_interrupt)?  This then made some calls, including a call to ll_mdc_blocking_ast.  Is that right?
>  
> Is ll_mdc_blocking_ast supposed to run under an interrupt ???

No.  Linux stack traces are terrible; they print anything on the stack that looks like the address of a kernel or module function.  That includes function addresses that are passed as parameters, such as callback functions.  It must have hit an interrupt at some point, but I think the rest is just random garbage on the stack.
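
A hypothetical user-space illustration of how that happens (my_blocking_ast and enqueue_lock are made-up names, not Lustre functions): a callback address that is merely passed as a parameter still lands in a register or on the stack, so a scanner that prints every plausible code address will report it at offset +0x0 even though it never ran.

/* Hypothetical user-space illustration, not Lustre code.  The address
 * of my_blocking_ast is only ever passed as a parameter; it is never
 * called.  A stack scanner that prints every plausible code address it
 * finds would still report "my_blocking_ast+0x0" in a trace. */
#include <stdio.h>

static int my_blocking_ast(void *data)          /* never executed */
{
        (void)data;
        return 0;
}

static int (*volatile registered_cb)(void *);   /* keeps the store alive */

static void enqueue_lock(int (*blocking_cb)(void *))
{
        /* The callback pointer passes through a register or this
         * frame's stack on its way here, which is exactly the kind of
         * value a naive unwinder mistakes for a return address. */
        registered_cb = blocking_cb;
}

int main(void)
{
        enqueue_lock(my_blocking_ast);
        puts("callback registered, never invoked");
        return 0;
}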

> Look how deep the stack goes after that.  No wonder there is a lockup!  Normally, I would expect an interrupt service routine to do something quickly, and then start a worker thread if something more substantial needed to be done.
>  
> Perhaps I am completely misunderstanding this stack trace.  Can someone please advise me?
>  
> Also, how do I interpret a line like this:
>  
> [<ffffffffa099d990>] ? ll_mdc_blocking_ast+0x0/0x890 [lustre]
> 
> Why is there a question mark?

Not sure.  It may mark addresses that were found on the stack but could not be verified as part of the actual call chain, which would be consistent with the above.

>  What are the two addresses separated by the slash?

That is the relative location within the function, followed by the total size of the function.  In this case I think this is a function passed as a parameter, since it can hardly have called another function at an offset of 0 bytes.  Another example shows:

> ll_lookup_finish_locks+0x80/0x140

which means the address is 0x80 bytes into this 0x140-byte function.  This can be useful with gdb:

# gdb /lib/modules/2.6.32.26-175.fc12.x86_64/updates/kernel/fs/lustre/lustre.ko
gdb> list *(ll_lookup_finish_locks+0x80)
0x1900 is in ll_lookup_finish_locks (/root/rpmbuild/BUILD/lustre-1.8.6/lustre/llite/dcache.c:361)
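
As a cross-check on the offset arithmetic: the trace reported ll_lookup_finish_locks+0x80 and gdb resolves that to 0x1900, so the function evidently starts at 0x1900 - 0x80 = 0x1880 in the module's text section.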

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.
