[Lustre-discuss] slow journal/commitrw on OSTs lead to crash

Hendelman, Rob Rob.Hendelman at magnetar.com
Wed Apr 8 11:16:10 PDT 2009


Hello,

I recently upgraded our 1.6.4.3 centos servers to 1.6.7.  Clients are
still at 1.6.4.3.  We were going to upgrade to 1.6.7 but decided to wait
until 1.6.7.1 comes out with the fix for the bug mentioned on the list
late last week.

Our OST's are almost completely full (85-90% on each) and our MDT is
almost entirely empty.

Our setup is as follows:

Server nodes
============
Node 1 = MGS + MDT (separate mount points) with raid 10 of 146G sas 15k
disks, 1 gigabit ethernet
Node 4 = OSS with 10 OST's connected to disks via multipath FC.  Each
OST is a raid6 (done on the san side) of (I believe 12) 146G 15k disks.
Each node has 20G ram.

Our clients are running with 2.6.22 patchless kernel & lustre 1.4.3
Client nodes have 32G ram ea.

Each filesystem (mgs, mdt, ost) also has a separate device in either
raid 1 or 10 for the journal.

Last night around 6pm I started getting alerts that our lustre OSS was
unresponsive.  Looking in the logs, I found the following:

==========================================================
Apr  7 18:07:53 maglustre04 last message repeated 2 times
Apr  7 18:08:48 maglustre04 kernel: Lustre:
5228:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) fs01-OST0005: slow
journal start 31s
Apr  7 18:08:48 maglustre04 kernel: Lustre:
5228:0:(filter_io_26.c:792:filter_commitrw_write()) fs01-OST0005: slow
commitrw commit 31s
Apr  7 18:08:48 maglustre04 kernel: Lustre:
5227:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) fs01-OST0005: slow
journal start 30s
Apr  7 18:08:48 maglustre04 kernel: Lustre:
5218:0:(filter_io_26.c:792:filter_commitrw_write()) fs01-OST0005: slow
commitrw commit 30s
Apr  7 18:08:48 maglustre04 kernel: Lustre:
5227:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) Skipped 1 previous
similar message
Apr  7 18:10:15 maglustre04 kernel: Lustre:
6588:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0006:
dc9faf61-04e0-3f57-0cba-405fcac557f7 reconnecting
Apr  7 18:10:15 maglustre04 kernel: Lustre:
6588:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0006: refuse
reconnection from dc9faf61-04e0-3f57-0cba-405fcac557f7 at 10.5.20
0.13 at tcp to 0xffff810156556000; still busy with 2 active RPCs
Apr  7 18:10:15 maglustre04 kernel: LustreError:
6588:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-16)  req at ffff81026ddfd800 x21081511/t0 o8->dc9faf61-04
e0-3f57-0cba-405fcac557f7@:0/0 lens 304/200 e 0 to 0 dl 1239145915 ref 1
fl Interpret:/0/0 rc -16/0
Apr  7 18:10:15 maglustre04 kernel: LustreError:
6588:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 2 previous
similar messages
Apr  7 18:10:23 maglustre04 kernel: LustreError:
5269:0:(filter_io_26.c:779:filter_commitrw_write()) fs01-OST0006: slow
direct_io 107s
Apr  7 18:10:23 maglustre04 kernel: Lustre:
5269:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request
x21080837 took longer than estimated (100+8s); client may timeou
t.  req at ffff810529d42400 x21080837/t4450358
o4->dc9faf61-04e0-3f57-0cba-405fcac557f7@:0/0 lens 384/352 e 0 to 0 dl
1239145815 ref 1 fl Complete:/0/0 rc 0/0
Apr  7 18:10:25 maglustre04 kernel: LustreError:
5270:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) fs01-OST0008: slow
journal start 109s
Apr  7 18:10:25 maglustre04 kernel: LustreError:
5270:0:(filter_io_26.c:792:filter_commitrw_write()) fs01-OST0008: slow
commitrw commit 109s
Apr  7 18:10:25 maglustre04 kernel: Lustre:
5270:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request
x99233655 took longer than estimated (100+10s); client may timeo
ut.  req at ffff8104dd112800 x99233655/t4401190
o4->19615549-595b-2174-a224-57b10ed10a9c@:0/0 lens 384/352 e 0 to 0 dl
1239145815 ref 1 fl Complete:/0/0 rc 0/0
Apr  7 18:10:26 maglustre04 kernel: LustreError:
5210:0:(filter_io_26.c:714:filter_commitrw_write()) fs01-OST0009: slow
i_mutex 122s
Apr  7 18:10:26 maglustre04 kernel: LustreError:
5184:0:(filter_io_26.c:714:filter_commitrw_write()) fs01-OST0009: slow
i_mutex 122s
Apr  7 18:10:26 maglustre04 kernel: Lustre:
5184:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request
x99233618 took longer than estimated (100+22s); client may timeo
ut.  req at ffff8104a65f5c50 x99233618/t4374309
o4->19615549-595b-2174-a224-57b10ed10a9c@:0/0 lens 384/352 e 0 to 0 dl
1239145804 ref 1 fl Complete:/0/0 rc 0/0
Apr  7 18:10:26 maglustre04 kernel: Lustre:
5184:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 1
previous similar message
Apr  7 18:10:26 maglustre04 kernel: LustreError:
5241:0:(filter_io_26.c:779:filter_commitrw_write()) fs01-OST0009: slow
direct_io 122s
Apr  7 18:10:27 maglustre04 kernel: Lustre:
6341:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0001:
e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 reconnecting
Apr  7 18:10:27 maglustre04 kernel: Lustre:
5139:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0000:
e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 reconnecting
Apr  7 18:10:27 maglustre04 kernel: Lustre:
6398:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0006: refuse
reconnection from e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 at 10.5.20
0.12 at tcp to 0xffff810525b80000; still busy with 9 active RPCs
Apr  7 18:10:27 maglustre04 kernel: Lustre:
6341:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 6 previous
similar messages
Apr  7 18:10:32 maglustre04 kernel: Lustre:
6439:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0009:
5824cc81-63ce-7c87-cdd6-42422d0c4dcd reconnecting
Apr  7 18:10:32 maglustre04 kernel: Lustre:
6439:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 6 previous
similar messages
Apr  7 18:10:34 maglustre04 kernel: Lustre:
6477:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0009:
19615549-595b-2174-a224-57b10ed10a9c reconnecting
Apr  7 18:10:34 maglustre04 kernel: Lustre:
6477:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 1 previous
similar message
Apr  7 18:10:37 maglustre04 kernel: Lustre:
6430:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST000a:
5824cc81-63ce-7c87-cdd6-42422d0c4dcd reconnecting
Apr  7 18:10:37 maglustre04 kernel: Lustre:
6430:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 10 previous
similar messages
Apr  7 18:10:41 maglustre04 kernel: Lustre:
6574:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0002:
c454c19b-6df7-3d89-33b6-e421b8135c62 reconnecting
Apr  7 18:10:41 maglustre04 kernel: Lustre:
6413:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0001:
c454c19b-6df7-3d89-33b6-e421b8135c62 reconnecting
Apr  7 18:10:41 maglustre04 kernel: Lustre:
6413:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 2 previous
similar messages
Apr  7 18:10:41 maglustre04 kernel: Lustre:
6574:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 6 previous
similar messages
Apr  7 18:10:51 maglustre04 kernel: Lustre:
6598:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0005:
e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 reconnecting
Apr  7 18:10:51 maglustre04 kernel: Lustre:
6598:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 2 previous
similar messages
Apr  7 18:11:07 maglustre04 kernel: Lustre:
5027:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0005: refuse
reconnection from 19615549-595b-2174-a224-57b10ed10a9c at 10.5.20
0.15 at tcp to 0xffff8102a7270000; still busy with 2 active RPCs
Apr  7 18:11:16 maglustre04 kernel: Lustre:
5076:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0006:
e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 reconnecting
Apr  7 18:11:16 maglustre04 kernel: Lustre:
5076:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 17 previous
similar messages
Apr  7 18:12:11 maglustre04 kernel: Lustre:
5180:0:(filter_io.c:551:filter_preprw_write()) fs01-OST0008: slow
preprw_write setup 37s
Apr  7 18:12:45 maglustre04 kernel: Lustre:
5220:0:(filter_io_26.c:779:filter_commitrw_write()) fs01-OST0003: slow
direct_io 33s
Apr  7 18:12:48 maglustre04 kernel: Lustre:
11010:0:(socklnd_cb.c:2728:ksocknal_check_peer_timeouts()) Stale ZC_REQs
for peer 10.5.200.12 at tcp detected: 1; the oldest (ffff8105
1f1de000) timed out 9 secs ago
Apr  7 18:12:56 maglustre04 kernel: Lustre:
5063:0:(ldlm_lib.c:541:target_handle_reconnect()) fs01-OST0006:
e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 reconnecting
Apr  7 18:12:56 maglustre04 kernel: Lustre:
5063:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 1 previous
similar message
Apr  7 18:12:56 maglustre04 kernel: Lustre:
5063:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0006: refuse
reconnection from e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 at 10.5.20
0.12 at tcp to 0xffff810525b80000; still busy with 2 active RPCs
Apr  7 18:12:56 maglustre04 kernel: LustreError:
5063:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-16)  req at ffff8103dc39bc00 x243110586/t0 o8->e293e1d4-5
1a4-a8f4-a7c0-c1aaab662a94@:0/0 lens 304/200 e 0 to 0 dl 1239146076 ref
1 fl Interpret:/0/0 rc -16/0
Apr  7 18:12:56 maglustre04 kernel: LustreError:
5063:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 2 previous
similar messages
Apr  7 18:12:56 maglustre04 kernel: LustreError:
5209:0:(ost_handler.c:899:ost_brw_read()) @@@ timeout on bulk PUT after
68+0s  req at ffff8102e7357800 x243110512/t0 o3->e293e1d4
-51a4-a8f4-a7c0-c1aaab662a94@:0/0 lens 384/336 e 0 to 0 dl 1239145976
ref 1 fl Interpret:/0/0 rc 0/0
Apr  7 18:12:57 maglustre04 kernel: Lustre:
5215:0:(filter_io_26.c:779:filter_commitrw_write()) fs01-OST0007: slow
direct_io 39s
Apr  7 18:12:57 maglustre04 kernel: Lustre:
5209:0:(ost_handler.c:957:ost_brw_read()) fs01-OST0006: ignoring bulk IO
comm error with e293e1d4-51a4-a8f4-a7c0-c1aaab662a94@ id 1
2345-10.5.200.12 at tcp - client will retry
Apr  7 18:12:57 maglustre04 kernel: Lustre:
5209:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request
x243110512 took longer than estimated (100+1s); client may timeo
ut.  req at ffff8102e7357800 x243110512/t0
o3->e293e1d4-51a4-a8f4-a7c0-c1aaab662a94@:0/0 lens 384/336 e 0 to 0 dl
1239145976 ref 1 fl Complete:/0/0 rc 0/0
Apr  7 18:12:57 maglustre04 kernel: Lustre:
5209:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 6
previous similar messages
========================================================================
Soon the OSS started refusing connections:
========================================================================
======
Apr  7 18:18:30 maglustre04 kernel: Lustre:
5206:0:(filter_io.c:551:filter_preprw_write()) fs01-OST0002: slow
preprw_write setup 42s
Apr  7 18:18:34 maglustre04 kernel: Lustre:
5079:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0008: refuse
reconnection from e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 at 10.5.20
0.12 at tcp to 0xffff810337f8a000; still busy with 2 active RPCs
Apr  7 18:18:34 maglustre04 kernel: LustreError:
5079:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-16)  req at ffff81052b724400 x243111082/t0 o8->e293e1d4-5
1a4-a8f4-a7c0-c1aaab662a94@:0/0 lens 304/200 e 0 to 0 dl 1239146414 ref
1 fl Interpret:/0/0 rc -16/0
Apr  7 18:18:34 maglustre04 kernel: LustreError:
5079:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 1 previous
similar message
Apr  7 18:18:38 maglustre04 kernel: LustreError:
11000:0:(events.c:66:request_out_callback()) @@@ type 4, status -5
req at ffff8104e3853400 x208570/t0 o400->MGS at MGC10.5.10.11@tc
p_0:26/25 lens 128/256 e 0 to 1 dl 1239146349 ref 2 fl Rpc:N/0/0 rc 0/0
Apr  7 18:18:38 maglustre04 kernel: LustreError:
11000:0:(events.c:66:request_out_callback()) @@@ type 4, status -5
req at ffff81051e07fc00 x208571/t0 o400->MGS at MGC10.5.10.11@tc
p_0:26/25 lens 128/256 e 0 to 1 dl 1239146374 ref 2 fl Rpc:N/0/0 rc 0/0
Apr  7 18:18:38 maglustre04 kernel: Lustre: Request x208570 sent from
MGC10.5.10.11 at tcp to NID 10.5.10.11 at tcp 69s ago has timed out (limit
100s).
Apr  7 18:18:38 maglustre04 kernel: LustreError: 166-1:
MGC10.5.10.11 at tcp: Connection to service MGS via nid 10.5.10.11 at tcp was
lost; in progress operations using this service
 will fail.
Apr  7 18:18:38 maglustre04 kernel: Lustre: MGC10.5.10.11 at tcp:
Reactivating import
Apr  7 18:18:38 maglustre04 kernel: Lustre: MGC10.5.10.11 at tcp:
Connection restored to service MGS using nid 10.5.10.11 at tcp.
Apr  7 18:18:45 maglustre04 kernel: Lustre: fs01-OST0000: received MDS
connection from 10.5.10.11 at tcp
Apr  7 18:18:45 maglustre04 kernel: Lustre: Skipped 10 previous similar
messages
Apr  7 18:18:45 maglustre04 kernel: Lustre:
6448:0:(filter.c:2849:filter_destroy_precreated()) fs01-OST0000:
deleting orphan objects from 184495 to 184526
Apr  7 18:18:45 maglustre04 kernel: Lustre:
6448:0:(filter.c:2849:filter_destroy_precreated()) Skipped 8 previous
similar messages
Apr  7 18:18:53 maglustre04 kernel: LustreError:
11004:0:(events.c:367:server_bulk_callback()) event type 4, status -5,
desc ffff8104f2272000
Apr  7 18:18:53 maglustre04 kernel: LustreError:
11004:0:(events.c:367:server_bulk_callback()) event type 4, status -5,
desc ffff81051f681000
Apr  7 18:18:56 maglustre04 kernel: LustreError:
5226:0:(ost_handler.c:1131:ost_brw_write()) @@@ timeout on bulk GET
after 46+0s  req at ffff8104e516d800 x99305448/t0 o4->1961554
9-595b-2174-a224-57b10ed10a9c@:0/0 lens 384/352 e 0 to 0 dl 1239146336
ref 1 fl Interpret:/0/0 rc 0/0
Apr  7 18:18:56 maglustre04 kernel: Lustre:
5226:0:(ost_handler.c:1277:ost_brw_write()) fs01-OST0005: ignoring bulk
IO comm error with 19615549-595b-2174-a224-57b10ed10a9c@ id
 12345-10.5.200.15 at tcp - client will retry
Apr  7 18:19:01 maglustre04 kernel: Lustre:
6350:0:(ldlm_lib.c:780:target_handle_connect()) fs01-OST0008: refuse
reconnection from e293e1d4-51a4-a8f4-a7c0-c1aaab662a94 at 10.5.20
0.12 at tcp to 0xffff810337f8a000; still busy with 2 active RPCs
Apr  7 18:19:13 maglustre04 last message repeated 2 times
Apr  7 18:19:13 maglustre04 last message repeated 2 times
============================================================

Considering we have a _very_ small installation compared to almost
everyone else on the list with 2 servers & 5 client nodes I'm not sure
exactly where to start looking.

I did find a thread from December 08 with similar error messages but it
seemed confined to a single OST.  This seemed to hit all our OST's one
after the other.  We are using the default striping which only has 1
file per OST as well.

Storage Group reported no errors on FC switch or disks on the SAN.
Networking group reported no errors on the gigabit Ethernet switch so I
don't think it is a networking error.

This is the only issue we've had so far after upgrading to 1.6.7 on the
servers in approximately 3 weeks of regular use.

Any suggestions are appreciated.

Also: If anyone over there @ Sun has a contact of someone in sales I can
call for a commercial support contract, this is something we are
interested in looking at as well.

Thanks,

Robert

Robert Hendelman Jr
Magnetar Capital LLC
Rob.Hendelman at magnetar.com
1-847-905-4557



The information contained in this message and its attachments 
is intended only for the private and confidential use of the 
intended recipient(s).  If you are not the intended recipient 
(or have received this e-mail in error) please notify the 
sender immediately and destroy this e-mail. Any unauthorized 
copying, disclosure or distribution of the material in this e-
mail is strictly prohibited.



More information about the lustre-discuss mailing list