[Lustre-discuss] MDS overload, why?

Arne Brutschy arne.brutschy at ulb.ac.be
Tue Oct 20 07:13:25 PDT 2009


Hello,

Thanks for your suggestions. I tried to investigate the reason for the
MDS overload further. It happened 3 times last week, even after I locked
out the user I thought was causing it.

As suggested, I set up a serial console monitor on the MDS and installed
LMT (not particularly easy :). This morning, the MDS was overloaded
again. The CPU load spiked beyond 150 and the logs were showing the
following errors:

-------8<----------
Oct 20 12:02:38 compute-2-0 kernel: Lustre: 9806:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x64452376 took longer than estimated (100+87s); client may timeout.  req at f6081c00 x64452376/t1397543027 o101->5f1b0ba3-5e5b-d6b0-17a2-8e974cb5df12@:0/0 lens 496/472 e 0 to 0 dl 1256032871 ref 1 fl Complete:/0/0 rc 301/301
LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 187s: evicting client at 10.255.255.190 at tcp  ns: mds-lustre-MDT0000_UUID lock: f0249540/0xefbdbd73c4253265 lrc: 3/0,0 mode: CR/CR res: 1343024/2832666510 bits 0x3 rrc: 174 type: IBT flags: 0x4000030 remote: 0x73cb76d4b7121d86 expref: 1253 pid: 9807 timeout: 8475134
Lustre: 9747:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x78099102 took longer than estimated (100+92s); client may timeout.  req at f4843a00 x78099102/t0 o101->99653b1f-c07d-b7c3-93fc-11f9ba3630f8@:0/0 lens 432/416 e 0 to 0 dl 1256032871 ref 1 fl Complete:/0/0 rc 0/0
LustreError: 9886:0:(ldlm_lockd.c:1161:ldlm_handle_enqueue()) ### lock on destroyed export f631b000 ns: mds-lustre-MDT0000_UUID lock: d23069c0/0xefbdbd73c4253281 lrc: 3/0,0 mode: CR/CR res: 1343024/2832666510 bits 0x3 rrc: 168 type: IBT flags: 0x4000030 remote: 0x9422ea29787ca6a7 expref: 3 pid: 9886 timeout: 0
Oct 20 12:02:43 compute-2-0 kernel: LustreError: 9886:0:(ldlm_lockd.c:1161:ldlm_handle_enqueue()) ### lock on destroyed export f631b000 ns: mds-lustre-MDT0000_UUID lock: d23069c0/0xefbdbd73c4253281 lrc: 3/0,0 mode: CR/CR res: 1343024/2832666510 bits 0x3 rrc: 168 type: IBT flags: 0x4000030 remote: 0x9422ea29787ca6a7 expref: 3 pid: 9886 timeout: 0
Lustre: 4380:0:(import.c:507:import_select_connection()) lustre-OST0002-osc: tried all connections, increasing latency to 26s
LustreError: 167-0: This client was evicted by lustre-OST0002; in progress operations using this service will fail.
LustreError: 12203:0:(quota_ctl.c:288:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
Oct 20 12:02:45 LustreError: 12203:0:(quota_ctl.c:288:client_quota_ctl()) Skipped 1 previous similar message
compute-2-0 kernel: LustreError: 12203:0:(quota_master.c:1661:qmaster_recovery_main()) qmaster recovery failed! (id:552 type:0 rc:-3)
LustreError: 167-0: This client was evicted by lustre-OST0002; in progress operations using this service will fail.
Oct 20 12:02:45 compute-2-0 kernel: LustreError: 12203:0:(quota_ctl.c:288:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
Oct 20 12:02:45 compute-2-0 kernel: LustreError: 12203:0:(quota_ctl.c:288:client_quota_ctl()) Skipped 1 previous similar message
Oct 20 12:02:45 compute-2-0 kernel: LustreError: 12203:0:(quota_master.c:1661:qmaster_recovery_main()) qmaster recovery failed! (id:552 type:0 rc:-3)
LustreError: 11770:0:(llog_server.c:471:llog_origin_handle_cancel()) Cancel 62 of 124 llog-records failed: -2
Oct 20 12:02:47 compute-2-0 kernel: LustreError: 11770:0:(llog_server.c:471:llog_origin_handle_cancel()) Cancel 62 of 124 llog-records failed: -2
LustreError: 9771:0:(handler.c:1601:mds_handle()) operation 35 on unconnected MDS from 12345-10.255.255.190 at tcp
Oct 20 12:02:49 compute-2-0 kernel: LustreError: 9771:0:(handler.c:1601:mds_handle()) operation 35 on unconnected MDS from 12345-10.255.255.190 at tcp
Lustre: 0:0:(watchdog.c:148:lcw_cb()) Watchdog triggered for pid 9891: it was inactive for 200s
Lustre: 0:0:(linux-debug.c:222:libcfs_debug_dumpstack()) showing stack for process 9816
ll_mdt_75     S 00000246 ll_mdt_94      3660  9816      1 S       00000246    9817  9815 (L-TLB)
 3556  9882      1 eef1f8e4       00000046    988300000000 00000246   9881f9743d7c f60c17b0  (L-TLB)
4add8a03 ee1337a4 d1e28c40 00000046 
       f9677b82 00000000 00000000 00000246 f9743d7c c5664780 f6af2cb0 c5663de0 4add8a03 e02c1640 00000001 
       f9677b82 00000000 a135db00 00000000 c566c780 0098e380 c566bde0 
       00000002 f60c17b0 f6cf3170 00000000 a135db00 f6cf32dc 0098e380 00000000 
       f6af2cb0 f9678064 00000286 ecfaf870 eef1e000 ecfaf9dc d1e28c40 00000000 
Call Trace:
f9678064 00000286 ee132000 e02c1640 
Call Trace:
 [<f9677b82>] [<f9677b82>] ldlm_expired_completion_wait+0xb2/0x510 [ptlrpc]
 ldlm_expired_completion_wait+0xb2/0x510 [ptlrpc]
 [<f9678064>] [<f9678064>] is_granted_or_cancelled+0x44/0x80 [ptlrpc]
 is_granted_or_cancelled+0x44/0x80 [ptlrpc]
 [<f96789fd>] [<f96789fd>] ldlm_completion_ast+0x95d/0xfb0 [ptlrpc]
 ldlm_completion_ast+0x95d/0xfb0 [ptlrpc]
 [<f969b06a>] [<f969b06a>] ldlm_process_inodebits_lock+0x27a/0x7a0 [ptlrpc]
 ldlm_process_inodebits_lock+0x27a/0x7a0 [ptlrpc]
 [<c02de9c6>] [<c02de9c6>] __cond_resched+0x14/0x39
 __cond_resched+0x14/0x39
 [<c011e7ac>] [<c011e7ac>] default_wake_function+0x0/0xc
 default_wake_function+0x0/0xc
 [<f9652cb7>] [<f9652cb7>] ldlm_lock_enqueue+0x1e7/0x11f0 [ptlrpc]
 ldlm_lock_enqueue+0x1e7/0x11f0 [ptlrpc]
 [<f9677ad0>] [<f9677ad0>] ldlm_expired_completion_wait+0x0/0x510 [ptlrpc]
 [<f9677ac0>] ldlm_expired_completion_wait+0x0/0x510 [ptlrpc]
 [<f9677ac0>] interrupted_completion_wait+0x0/0x10 [ptlrpc]
 interrupted_completion_wait+0x0/0x10 [ptlrpc]
 [<f9677ad0>] [<f9677ad0>] ldlm_expired_completion_wait+0x0/0x510 [ptlrpc]
 [<f9677ac0>] ldlm_expired_completion_wait+0x0/0x510 [ptlrpc]
 [<f9677ac0>] interrupted_completion_wait+0x0/0x10 [ptlrpc]
 interrupted_completion_wait+0x0/0x10 [ptlrpc]
 [<f96780a0>] [<f96780a0>] ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 [<f967a00a>] ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 [<f967a00a>] ldlm_cli_enqueue_local+0x91a/0xac0 [ptlrpc]
 ldlm_cli_enqueue_local+0x91a/0xac0 [ptlrpc]
 [<f9679050>] [<f9679050>] ldlm_blocking_ast+0x0/0x690 [ptlrpc]
 [<f96780a0>] ldlm_blocking_ast+0x0/0x690 [ptlrpc]
 [<f96780a0>] ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 [<f8f468eb>] [<f8f468eb>] enqueue_ordered_locks+0x1db/0x850 [mds]
 enqueue_ordered_locks+0x1db/0x850 [mds]
 [<f9679050>] [<f9679050>] ldlm_blocking_ast+0x0/0x690 [ptlrpc]
 [<f96780a0>] ldlm_blocking_ast+0x0/0x690 [ptlrpc]
 [<f96780a0>] ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 ldlm_completion_ast+0x0/0xfb0 [ptlrpc]
 [<c0168e0f>] [<c0168e44>] __lookup_hash+0x47/0x95
 __lookup_hash+0x7c/0x95
 [<f8f48a8c>] [<f8f48a8c>] mds_get_parent_child_locked+0x3cc/0xe30 [mds]
 mds_get_parent_child_locked+0x3cc/0xe30 [mds]
 [<f8cdf24f>] ksocknal_queue_tx_locked+0x1cf/0x630 [ksocklnd]
 [<f895c025>] upcall_cache_get_entry+0x295/0xd10 [lvfs]
 [<f8f6fdce>] mds_open+0x70e/0x53c0 [mds] [<f8f204d1>]
 mds_getattr_lock+0x241/0x1490 [mds]
 [<c011e832>] [<f96c8a3d>] lustre_msg_get_flags+0x10d/0x1f0 [ptlrpc]
 [<f8f34040>] fixup_handle_for_resent_req+0x40/0x480 [mds]
 [<f96c8a3d>] lustre_msg_get_flags+0x10d/0x1f0 [ptlrpc]
 [<f8f34e71>] mds_intent_policy+0x9f1/0x1500 [mds]
 [<f8bbdfdb>] LNetMDBind+0x1fb/0x6b0 [lnet]
 [<f964bc9e>] __wake_up+0x29/0x3c
 [<f895c025>] ldlm_lock_remove_from_lru+0x5e/0x210 [ptlrpc]
 upcall_cache_get_entry+0x295/0xd10 [lvfs]
 [<f8cdf8e3>] ksocknal_launch_packet+0x143/0x5c0 [ksocklnd]
 [<f96c7ea3>] [<f8f34480>] mds_intent_policy+0x0/0x1500 [mds]
 [<f96531a3>] lustre_msg_string+0x43/0x510 [ptlrpc]
 [<c0170c30>] ldlm_lock_enqueue+0x6d3/0x11f0 [ptlrpc]
 [<f89dc8e0>] lustre_hash_add+0x180/0x4d0 [obdclass]
 [<f968a6ab>] ldlm_handle_enqueue+0x67b/0x23f0 [ptlrpc]
 [<f9688450>] ldlm_server_blocking_ast+0x0/0xc10 [ptlrpc]
 [<f9689060>] ldlm_server_completion_ast+0x0/0x990 [ptlrpc]
 [<f9688450>] ldlm_server_blocking_ast+0x0/0xc10 [ptlrpc]
 [<f8f29674>] mds_handle+0x21e4/0x5830 [mds]
 [<f96cac0c>] lustre_msg_get_conn_cnt+0x10c/0x1f0 [ptlrpc]
 [<c011cdc3>] activate_task+0x88/0x95
 [<f96d4d2e>] ptlrpc_check_req+0xe/0x220 [ptlrpc]
 [<f96c9a8c>] dput+0x34/0x19b
 [<f8f55c48>] mds_reint_rec+0xe8/0x3a0 [mds]
 lustre_msg_get_opc+0x10c/0x1f0 [ptlrpc]
 [<c011de69>] [<f96c4260>] lustre_pack_reply_v2+0x1f0/0x620 [ptlrpc]
 [<f8f7b3d0>] mds_update_unpack+0x320/0x550 [mds]
 [<c02de9c6>] find_busiest_group+0xdd/0x295
 [<c010aea2>] do_gettimeofday+0x1a/0x9c
 [<f96d9f8a>] ptlrpc_server_handle_request+0x28a/0x1e80 [ptlrpc]
 [<c010aea2>] do_gettimeofday+0x1a/0x9c
 [<f894218c>] lcw_update_time+0x1c/0x1a0 [libcfs]
 [<c02de2dd>]<3>LustreError: 9881:0:(handler.c:1601:mds_handle()) operation 400 on unconnected MDS from 12345-10.255.255.191 at tcp
 __cond_resched+0x14/0x39
 [<f8f2438e>] mds_reint+0xfe/0x770 [mds]
 [<f96c8a3d>] lustre_msg_get_flags+0x10d/0x1f0 [ptlrpc]
 [<f895ee68>] lprocfs_counter_add+0xf8/0x120 [lvfs]
 [<f8f34f28>] mds_intent_policy+0xaa8/0x1500 [mds]
 [<f96547a4>] ldlm_run_cp_ast_work+0x174/0x330 [ptlrpc]
 [<f9653d87>] ldlm_reprocess_queue+0xc7/0x390 [ptlrpc]
 [<f969cbd8>] ldlm_pool_recalc+0x108/0x120 [ptlrpc]
 [<f964b045>] lock_res_and_lock+0x45/0xc0 [ptlrpc]
 [<f8f34480>] mds_intent_policy+0x0/0x1500 [mds]
 [<f96531a3>] ldlm_lock_enqueue+0x6d3/0x11f0 [ptlrpc]
 [<f89dc8e0>] schedule+0x855/0x8f3
 lustre_hash_add+0x180/0x4d0 [obdclass]
 [<c02de30d>] [<f968a6ab>] ldlm_handle_enqueue+0x67b/0x23f0 [ptlrpc]
 [<f9688450>] ldlm_server_blocking_ast+0x0/0xc10 [ptlrpc]
 [<f9689060>] ldlm_server_completion_ast+0x0/0x990 [ptlrpc]
 [<f9688450>] ldlm_server_blocking_ast+0x0/0xc10 [ptlrpc]
 [<f8f29674>] mds_handle+0x21e4/0x5830 [mds]
 [<f96cac0c>] lustre_msg_get_conn_cnt+0x10c/0x1f0 [ptlrpc]
 [<c011cdc3>] activate_task+0x88/0x95
 [<f96d4d2e>] ptlrpc_check_req+0xe/0x220 [ptlrpc]
 [<f96c9a8c>] lustre_msg_get_opc+0x10c/0x1f0 [ptlrpc]
 [<c010aea2>] do_gettimeofday+0x1a/0x9c
 [<f96d9f8a>] ptlrpc_server_handle_request+0x28a/0x1e80 [ptlrpc]
 [<c010aea2>] do_gettimeofday+0x1a/0x9c
 [<f894218c>] lcw_update_time+0x1c/0x1a0 [libcfs]
 [<f894218c>] lcw_update_time+0x1c/0x1a0 [libcfs]
 [<c011e832>] __wake_up+0x29/0x3c
 [<f89423c0>] lc_watchdog_touch_ms+0xb0/0x280 [libcfs]
 [<f894263c>] lc_watchdog_disable+0x7c/0x240 [libcfs]
 [<f89345b3>] cfs_block_sigs+0x73/0xa0 [libcfs]
 [<f96dcb3b>] ptlrpc_main+0x62b/0xdb0 [ptlrpc]
 [<f96dc500>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
 [<f96dc500>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
 [<c02e03b2>] schedule+0x885/0x8f3
 [<f89423c0>] lc_watchdog_touch_ms+0xb0/0x280 [libcfs]
 [<f89345b3>] cfs_block_sigs+0x73/0xa0 [libcfs]
 [<f96dcb3b>] ptlrpc_main+0x62b/0xdb0 [ptlrpc]
 [<c011e7ac>] default_wake_function+0x0/0xc
 [<f96dc500>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
 [<c02e03b2>] ret_from_fork+0x6/0x14
 [<f96dc500>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
 [<f96dc510>] ptlrpc_main+0x0/0xdb0 [ptlrpc]
 [<c01041f5>] kernel_thread_helper+0x5/0xb
-------8<---------- 

This was followed by pages and pages of call traces. The system locked up
and I had to reboot the MDS (I managed to do it via the console; ssh
wasn't possible anymore). I've attached the full log, covering the period
from yesterday's reboot to today's reboot.
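
Since it's hard to be at the console when this happens, I might run a
small load watchdog along the following lines next time, so the kernel
ring buffer gets snapshotted automatically when the load spikes. This is
only a sketch; the threshold, polling interval and output path are
assumptions, not anything from our current setup:

-------8<----------
#!/usr/bin/env python
# Load-watchdog sketch (threshold, interval and output path are made up;
# adjust for the actual MDS).
import subprocess
import time

THRESHOLD = 50.0                      # 1-minute load average counting as "overloaded"
INTERVAL  = 10                        # seconds between checks
OUTPUT    = "/var/log/mds-spike.log"  # assumed output path

def load1():
    # First field of /proc/loadavg is the 1-minute load average.
    with open("/proc/loadavg") as f:
        return float(f.read().split()[0])

while True:
    load = load1()
    if load > THRESHOLD:
        with open(OUTPUT, "a") as out:
            out.write("=== load %.1f at %s ===\n" % (load, time.ctime()))
            # Dump the kernel ring buffer, where the Lustre errors and
            # watchdog stack traces show up first.
            out.write(subprocess.check_output(["dmesg"]).decode("utf-8", "replace"))
        time.sleep(300)               # back off so one spike isn't dumped repeatedly
    time.sleep(INTERVAL)
-------8<----------

With something like that running, the relevant Lustre errors might still
get captured even when ssh to the MDS is no longer possible.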

Reading the logs myself, it seems to me that the MDS loses the connection
to one of the OSSes and tries to recover from it. Afterwards, things go
haywire and the qmaster seems to crash? But I have trouble deciphering
the logs.
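
As a first step towards narrowing down which client is involved, I could
count the eviction and timeout messages per client NID with something
like the sketch below. Again only a rough sketch: /var/log/messages is an
assumed path, and the patterns simply match the message strings visible
in the excerpt above:

-------8<----------
#!/usr/bin/env python
# Count Lustre eviction/timeout messages per client NID in the MDS syslog.
# Sketch only: the log path is an assumption, and the patterns just match
# the message strings seen in the excerpt above.
import re
from collections import Counter

LOG = "/var/log/messages"

patterns = {
    "slow request": re.compile(r"took longer than estimated"),
    "lock timeout": re.compile(r"lock callback timer expired"),
    "eviction":     re.compile(r"evicting client at (\S+)"),  # NID follows "at"
}

per_type   = Counter()
per_client = Counter()

with open(LOG) as f:
    for line in f:
        for name, pat in patterns.items():
            m = pat.search(line)
            if m:
                per_type[name] += 1
                if name == "eviction":
                    per_client[m.group(1)] += 1

print("message counts: %s" % dict(per_type))
print("evictions per client NID:")
for nid, count in per_client.most_common():
    print("  %-30s %d" % (nid, count))
-------8<----------

Running something like this over the (decompressed) attached log should
at least show whether the evictions cluster around a single client.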

Can you see what's going on?

Cheers,
Arne


PS: A small recap of the setup: 1 MDS, 4 OSSs, ~80 clients, ~350 CPUs,
and a maximum of ~1000 active jobs.



On Fr, 2009-10-09 at 10:23 -0400, Brian J. Murrell wrote:
> On Fri, 2009-10-09 at 16:15 +0200, Arne Brutschy wrote:
> > Hi,
> 
> Hi,
> 
> > thanks for replying!
> 
> NP.
> 
> > I understand that without further information we can't do much about the
> > oopses.
> 
> Not just the oopses.  It's pretty hard to diagnose any problem without
> data.
> 
> > I was more hoping for some information regarding possible
> > sources of such an overload.
> 
> The possibilities are endless.  Lustre is a complex system with many,
> many interactions with many other components.  Sure, I could make
> stab-in-the-dark guesses like too little memory, too slow a disk, etc., but
> none of those are actually useful for anything more than a shopping trip,
> and even then new hardware might still not solve your problem.
> 
> > How can I find the source of the problem?
> 
> Examine the logs.  Lustre logs a lot of information when things are
> going wrong.
> 
> b.
> 
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
-- 
Arne Brutschy
Ph.D. Student                    Email    arne.brutschy(AT)ulb.ac.be
IRIDIA CP 194/6                  Web      iridia.ulb.ac.be/~abrutschy
Universite' Libre de Bruxelles   Tel      +32 2 650 3168
Avenue Franklin Roosevelt 50     Fax      +32 2 650 2715
1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mds.log.bz2
Type: application/x-bzip
Size: 21908 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20091020/4363e24d/attachment.bin>

