[Lustre-discuss] System Deadlock
Roger Spellman
Roger.Spellman at terascala.com
Wed Aug 17 13:23:00 PDT 2011
Hi,
I am in the process of porting the Lustre 1.8.4 client to a recent kernel,
2.6.38.8. This has been a challenge for a variety of reasons, such as
the removal of the dcache_lock from the kernel. I am pretty close to
having it working, but I can still trigger a system lockup. So far, I
have only tested Lustre over TCP over Ethernet.
Using magic sysrq, I was able to get a call trace, but I am not sure I
fully understand it. Can someone please validate my analysis?
Here is one of the running threads:
rm R running task 0 2039 2030 0x00000088
ffff88011cfbb578 ffffffffa059c8df 0000000000000000 0000000000000000
ffff88011caf8880 000000000001ce7d 00000000000000c1 0000000000000000
ffff88011cfbb548 ffffffffa06bfe34 0000000000000000 ffff88011ce9c800
Call Trace:
[<ffffffffa059c8df>] ? LNetMDUnlink+0x6f/0x110 [lnet]
[<ffffffffa06bfe34>] ? lustre_msg_get_slv+0x94/0x100 [ptlrpc]
[<ffffffff8104ce53>] ? __wake_up+0x53/0x70
[<ffffffffa0698c29>] ? ldlm_completion_ast+0x349/0x8d0 [ptlrpc]
[<ffffffffa067af48>] ? ldlm_lock_enqueue+0x228/0xbb0 [ptlrpc]
[<ffffffffa0675148>] ? lock_res_and_lock+0x58/0xe0 [ptlrpc]
[<ffffffffa067b9fd>] ? ldlm_lock_change_resource+0x12d/0x3f0 [ptlrpc]
[<ffffffffa067e6f9>] ? ldlm_resource_get+0xe9/0xc00 [ptlrpc]
[<ffffffffa067e1c3>] ? ldlm_resource_putref+0x73/0x430 [ptlrpc]
[<ffffffffa067c7b3>] ? ldlm_lock_match+0x273/0x8f0 [ptlrpc]
[<ffffffff8116b342>] ? find_inode+0x62/0xb0
[<ffffffffa097dea2>] ? ll_update_inode+0x3a2/0x1140 [lustre]
[<ffffffffa099d150>] ? fid_test_inode+0x0/0x80 [lustre]
[<ffffffff8116c766>] ? ifind+0x66/0xc0
[<ffffffffa099d150>] ? fid_test_inode+0x0/0x80 [lustre]
[<ffffffffa06764d1>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
[<ffffffffa0676846>] ? ldlm_lock_add_to_lru+0x46/0x110 [ptlrpc]
[<ffffffffa067e6f9>] ? ldlm_resource_get+0xe9/0xc00 [ptlrpc]
[<ffffffffa067664d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
[<ffffffffa0676951>] ? ldlm_lock_remove_from_lru+0x41/0x110 [ptlrpc]
[<ffffffffa067e1c3>] ? ldlm_resource_putref+0x73/0x430 [ptlrpc]
[<ffffffffa0676a41>] ? ldlm_lock_addref_internal_nolock+0x21/0xa0 [ptlrpc]
[<ffffffffa0677866>] ? search_queue+0xc6/0x170 [ptlrpc]
[<ffffffffa067c62a>] ? ldlm_lock_match+0xea/0x8f0 [ptlrpc]
[<ffffffffa043369e>] ? cfs_free+0xe/0x10 [libcfs]
[<ffffffffa06aedad>] ? __ptlrpc_req_finished+0x59d/0xb30 [ptlrpc]
[<ffffffffa09579f0>] ? ll_lookup_finish_locks+0x80/0x140 [lustre]
[<ffffffffa06764d1>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
[<ffffffffa0676846>] ? ldlm_lock_add_to_lru+0x46/0x110 [ptlrpc]
[<ffffffffa067bfb0>] ? ldlm_lock_decref_internal+0x2f0/0x880 [ptlrpc]
[<ffffffffa0678b8f>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
[<ffffffffa0678b8f>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
[<ffffffffa067cfa1>] ? ldlm_lock_decref+0x41/0xb0 [ptlrpc]
[<ffffffffa08e92f4>] ? mdc_set_lock_data+0xd4/0x270 [mdc]
[<ffffffffa0957950>] ? ll_intent_drop_lock+0xa0/0xc0 [lustre]
[<ffffffff8116966a>] ? d_kill+0xaa/0x110
[<ffffffffa09579f0>] ? ll_lookup_finish_locks+0x80/0x140 [lustre]
[<ffffffffa099d7ee>] ? ll_prepare_mdc_op_data+0xbe/0x120 [lustre]
[<ffffffffa04390c3>] ? ts_kernel_list_record_file_line+0x123/0x3c0 [libcfs]
[<ffffffffa0963f41>] ? __ll_inode_revalidate_it+0x191/0x6e0 [lustre]
[<ffffffffa099d990>] ? ll_mdc_blocking_ast+0x0/0x890 [lustre]
[<ffffffff8116a03b>] ? dput+0x9b/0x190
[<ffffffff8108854f>] ? up+0x2f/0x50
[<ffffffff814abb8e>] ? common_interrupt+0xe/0x13
[<ffffffffa099c9bb>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
[<ffffffff8116fb1f>] ? mntput_no_expire+0x4f/0x1c0
[<ffffffff8116fcad>] ? mntput+0x1d/0x30
[<ffffffff8115d8b2>] ? path_put+0x22/0x30
[<ffffffff81157b23>] ? vfs_fstatat+0x73/0x80
[<ffffffff81157b54>] ? sys_newfstatat+0x24/0x50
[<ffffffff8100bf82>] ? system_call_fastpath+0x16/0x1b
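For reference, each frame above follows the usual kernel format of symbol+offset/length, with an optional [module] suffix. The parser below is my own illustrative sketch (not a kernel or Lustre tool) of how such a line splits into its fields:

```python
import re

# A frame line looks like:
#   [<address>] ? symbol+offset/length [module]
# offset and length are hex values; the [module] suffix is absent for
# code built into the kernel. This regex and helper are illustrative only.
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<qmark>\?\s+)?"                       # optional leading '?'
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<length>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frame(line):
    """Split one stack-trace line into its fields, or return None."""
    m = TRACE_RE.search(line)
    if not m:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "has_qmark": m.group("qmark") is not None,
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "length": int(m.group("length"), 16),
        "module": m.group("module"),
    }

frame = parse_frame("[<ffffffffa099d990>] ? ll_mdc_blocking_ast+0x0/0x890 [lustre]")
print(frame["symbol"], hex(frame["offset"]), hex(frame["length"]), frame["module"])
# → ll_mdc_blocking_ast 0x0 0x890 lustre
```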
Am I to understand that mntput_no_expire was running, and then an
interrupt arrived (i.e., common_interrupt)? The interrupt handler then
made some calls, including one to ll_mdc_blocking_ast. Is that right?
Is ll_mdc_blocking_ast supposed to run in interrupt context?
Look how deep the stack goes after that. No wonder there is a lockup!
Normally, I would expect an interrupt service routine to do its work
quickly and hand anything more substantial off to a worker thread.
Perhaps I am completely misunderstanding this stack trace. Can someone
please advise me?
Also, how do I interpret a line like this:
[<ffffffffa099d990>] ? ll_mdc_blocking_ast+0x0/0x890 [lustre]
Why is there a question mark? What are the two hex values separated by
the slash?
Thanks.
Roger Spellman
Staff Engineer
Terascala, Inc.
508-588-1501
www.terascala.com