[Lustre-discuss] Unable to activate inactive OSTs

Dan dan at nerp.net
Wed Apr 14 09:50:37 PDT 2010


Chris,

I've not upgraded or changed configuration.  Running RHEL 4 w/ Lustre
1.6.7.2.  An OSS crashed, and some OSTs show a failure to recover on the
MDT, but the OSS itself looks fine.  Interesting?  There are countless
pages of errors; here is a good sample of what I'm seeing.



Apr 11 04:04:19 gto kernel: LustreError:
4228:0:(mds_open.c:1567:mds_close()) Skipped 5 previous similar messages
Apr 11 04:04:19 gto kernel: LustreError:
4228:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-116) req at 0000010120cd8400 x115633406/t0
o35->dd3dbaa4-fd91-7e4c-a254-6ccc5b050949 at NET_0x2000080ae02c6_UUID:0/0
lens 296/1456 e 0 to 0 dl 1270983959 ref 1 fl Interpret:/2/0 rc -116/0
Apr 11 04:04:19 gto kernel: LustreError:
4228:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 5 previous
similar messages
Apr 11 04:05:59 gto kernel: Lustre:
5309:0:(ldlm_lib.c:541:target_handle_reconnect()) feline-MDT0000:
dd3dbaa4-fd91-7e4c-a254-6ccc5b050949 reconnecting
Apr 11 04:05:59 gto kernel: Lustre:
5309:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 5 previous
similar messages
Apr 11 04:14:19 gto kernel: LustreError:
5911:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino
7208962: cookie 0xb9c67340d2497975 req at 000001005bb03000
x115633406/t0
o35->dd3dbaa4-fd91-7e4c-a254-6ccc5b050949 at NET_0x2000080ae02c6_UUID:0/0
lens 296/1456 e 0 to 0 dl 1270984559 ref 1 fl Interpret:/2/0
rc 0/0

Apr 4 15:41:15 gto kernel: LustreError:
32555:0:(lov_request.c:692:lov_update_create_set()) error creating fid
0x7801ce sub-object on OST idx 16/1: rc = -110
Apr 5 11:10:14 gto kernel: LustreError:
32581:0:(llog_obd.c:226:llog_add()) Skipped 2 previous similar messages
Apr 5 11:10:14 gto kernel: LustreError:
32581:0:(lov_log.c:118:lov_llog_origin_add()) Can't add llog (rc = -19)
for stripe 0
Apr 5 11:10:14 gto kernel: LustreError:
32581:0:(lov_log.c:118:lov_llog_origin_add()) Skipped 2 previous similar
messages
Apr 5 11:10:15 gto kernel: LustreError:
32566:0:(llog_obd.c:226:llog_add()) No ctxt
Apr 5 11:10:15 gto kernel: LustreError:
32566:0:(llog_obd.c:226:llog_add()) Skipped 71 previous similar messages
Apr 5 11:10:15 gto kernel: LustreError:
32566:0:(lov_log.c:118:lov_llog_origin_add()) Can't add llog (rc = -19)
for stripe 0
Apr 5 11:10:15 gto kernel: LustreError:
32566:0:(lov_log.c:118:lov_llog_origin_add()) Skipped 71 previous
similar messages
Apr 5 11:10:16 gto kernel: LustreError: 32561 No ctxt

Apr 6 15:14:16 gto kernel: LustreError:
32557:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-16) req at 00000101d3976000 x16551/t0
o38->6271429a-1e25-5630-4b4c-42a685104c79@NET_0x2000080ae0297_UUID:0/0
lens 304/200 e 0 to 0 dl 1270592156 ref 1 fl Interpret:/0/0 rc -16/0
Apr 6 15:14:16 gto kernel: Lustre:
32552:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request
x16479 took longer than estimated (100+50s); client may timeout.
req at 000001002958d000 x16479/t744881971
o101->6271429a-1e25-5630-4b4c-42a685104c79 at NET_0x2000080ae0297_UUID:0/0
lens 512/472 e 0 to 0 dl 1270592006 ref 1 fl Complete:/0/0 rc 301/301

Apr 6 15:50:19 gto kernel: LustreError: 11-0: an error occurred while
communicating with 128.174.2.107 at tcp. The ost_connect operation failed
with -19
Apr 6 15:55:44 gto kernel: LustreError:
32553:0:(lov_request.c:692:lov_update_create_set()) error creating fid
0xd7039b sub-object on OST idx 8/1: rc = -110
Apr 6 15:59:52 gto kernel: LustreError:
32569:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error
(-16) req at 0000010007229800 x18429/t0
o38->6271429a-1e25-5630-4b4c-42a685104c79@NET_0x2000080ae0297_UUID:0/0
lens 304/200 e 0 to 0 dl 1270594892 ref 1 fl Interpret:/0/0 rc -16/0
Apr 6 16:56:57 gto kernel: LustreError:
1437:0:(events.c:66:request_out_callback()) @@@ type 4, status
req at 00000100a4bc6000 x10737832/t0
o8->feline-OST0005_UUID@128.174.2.192 at tcp:28/4 lens 304/456 e 0 to 1
dl 1270598222 ref 2 fl Rpc:N/0/0 rc 0/0

Apr 6 17:40:09 gto kernel: LustreError: 5203 lov_llog_init err
Apr 6 17:40:09 gto kernel: LustreError:
5203:0:(llog_obd.c:439:llog_cat_initialize()) rc: -2
Apr 6 17:40:09 gto kernel: Lustre:
5301:0:(mds_open.c:841:mds_open_by_fid()) Orphan d286f8:75f5909f found
and opened in PENDING directory
Apr 6 17:40:13 gto kernel: Lustre: feline-MDT0000: sending delayed
replies to recovered clients
Apr 6 17:40:13 gto kernel: Lustre:
5315:0:(mds_unlink_open.c:266:mds_cleanup_pending()) feline-MDT0000:
orphan d286f8:75f5909f re-opened during recovery
Apr 6 17:40:13 gto kernel: Lustre:
5315:0:(quota_master.c:1678:mds_quota_recovery()) Not all osts are
active, abort quota recovery
Apr 6 17:40:13 gto kernel: Lustre: feline-MDT0000: recovery complete:
rc 0
Apr 6 17:40:13 gto kernel: LustreError: 2:llog_lvfs_create()) error
looking up logfile 0x625001a:0x76682f22: rc -2
Apr 6 17:40:13 gto kernel: LustreError:
5480:0:(llog_lvfs.c:612:llog_lvfs_create()) Skipped 1 previous similar
message

Apr 6 17:40:13 gto kernel: LustreError:
5480:0:(llog_cat.c:172:llog_cat_id2handle()) error opening log id
0x625001a:76682f22: rc -2
Apr 6 17:40:13 gto kernel: LustreError:
5480:0:(llog_cat.c:172:llog_cat_id2handle()) Skipped 1 previous similar
message
Apr 6 17:40:13 gto kernel: LustreError:
5480:0:(llog_obd.c:279:cat_cancel_cb()) Cannot find handle for log
0x625001a
Apr 6 17:40:13 gto kernel: LustreError:
5480:0:(llog_obd.c:279:cat_cancel_cb()) Skipped 1 previous similar
message
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(llog_obd.c:350:llog_obd_origin_setup()) with cat_cancel_cb
failed: -2
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(llog_obd.c:350:llog_obd_origin_setup()) Skipped 1 previous
similar message
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(lov_log.c:243:lov_llog_init()) Skipped 1 previous similar
message
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(mds_log.c:219:mds_llog_init()) lov_llog_init err -2
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(mds_log.c:219:mds_llog_init()) Skipped 1 previous similar
message
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(llog_obd.c:439:llog_cat_initialize()) rc: -2
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(llog_obd.c:439:llog_cat_initialize()) Skipped 1 previous similar
message
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(mds_lov.c:918:__mds_lov_synchronize()) feline-OST0013_UUID
failed at update_mds: -2
Apr 6 17:40:13 gto kernel: LustreError:
5479:0:(mds_lov.c:960:__mds_lov_synchronize()) feline-OST0013_UUID sync
failed -2, deactivating
Apr 6 17:40:13 gto kernel: LustreError:
5460:0:(mds_lov.c:552:mds_lov_update_mds()) Failed to get objid --3
Apr 6 17:40:13 gto kernel: LustreError:
5460:0:(mds_lov.c:918:__mds_lov_synchronize()) feline-OST0000_UUID
failed at update_mds: -3
Apr 6 17:40:13 gto kernel: LustreError:
5460:0:(mds_lov.c:960:__mds_lov_synchronize()) feline-OST0000_UUID sync
failed -3, deactivating
Apr 6 17:40:13 gto kernel: Lustre: MDS feline-MDT0000:
feline-OST0010_UUID now active, resetting orphans
Apr 6 17:40:13 gto kernel: Lustre: MDS feline-MDT0000:
feline-OSTOOOCUUlD now active, resetting orphans
Apr 6 17:40:13 gto kernel: LustreError:
5461:0:(mds_lov.c:552:mds_lov_update_mds()) Failed to get objid --3
Apr 6 17:40:13 gto kernel: LustreError:
5464:0:(osc_create.c:362:osc_create()) feline-OST0004-osc: oscc recovery
failed: -11
Apr 6 17:40:13 gto kernel: LustreError:
5464:0:(lov_obd.c:1048:lov_clear_orphans()) error in orphan recovery on
OST idx 4/20: rc = -11
Apr 6 17:40:13 gto kernel: LustreError:
5464:0:(mds_lov.c:951:__mds_lov_synchronize()) feline-OST0004_UUID failed
at mds_lov_clear_orphans: -11
Apr 6 17:40:13 gto kernel: LustreError:
5465:0:(osc_create.c:362:osc_create()) feline-OST0005-osc: oscc recovery
failed: -11
Apr 6 17:40:13 gto kernel: LustreError:
5465:0:(lov_obd.c:1048:lov_clear_orphans()) error in orphan recovery on
OST idx 5/20: rc = -11
Apr 6 17:40:13 gto kernel: LustreError:
5465:0:(mds_lov.c:951:__mds_lov_synchronize()) feline-OST0005_UUID failed
at mds_lov_clear_orphans: -11
Apr 6 17:40:13 gto kernel: LustreError:
5466:0:(osc_create.c:362:osc_create()) feline-OST0006-osc: oscc recovery
failed: -11
Apr 6 17:40:13 gto kernel: LustreError:
5467:0:(osc_create.c:362:osc_create()) feline-OST0007-osc: oscc recovery
failed: -11
Apr 6 17:40:13 gto kernel: LustreError:
5468:0:(osc_create.c:362:osc_create()) feline-OST0008-osc: oscc recovery
failed: -11
Apr 6 17:40:13 gto kernel: LustreError:
5469:0:(osc_create.c:362:osc_create()) feline-OST0009-osc: oscc recovery
failed: -11
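
With this many near-identical messages, it can help to tally which OSTs
the MDS deactivated, which it brought back, and which failed oscc
recovery.  What follows is only a rough sketch, assuming the log has
already been cleaned up so that each entry sits on a single line.  The
sample lines are hand-cleaned from the excerpt above, and the names
SAMPLE, PATTERNS and summarize are made up for illustration; none of
this is part of any Lustre tool.

```python
import re
from collections import defaultdict

# Hypothetical, hand-cleaned sample: one log entry per line.
SAMPLE = """\
Apr 6 17:40:13 gto kernel: LustreError: 5479:0:(mds_lov.c:960:__mds_lov_synchronize()) feline-OST0013_UUID sync failed -2, deactivating
Apr 6 17:40:13 gto kernel: LustreError: 5460:0:(mds_lov.c:960:__mds_lov_synchronize()) feline-OST0000_UUID sync failed -3, deactivating
Apr 6 17:40:13 gto kernel: Lustre: MDS feline-MDT0000: feline-OST0010_UUID now active, resetting orphans
Apr 6 17:40:13 gto kernel: LustreError: 5464:0:(osc_create.c:362:osc_create()) feline-OST0004-osc: oscc recovery failed: -11
"""

# Message shapes taken from the excerpt; other Lustre versions may word
# these differently, so treat the patterns as assumptions.
PATTERNS = {
    "deactivated": re.compile(
        r"(\S+-OST[0-9a-f]{4})_UUID sync failed (-?\d+), deactivating"),
    "oscc_recovery_failed": re.compile(
        r"(\S+-OST[0-9a-f]{4})-osc: oscc recovery failed: (-?\d+)"),
    "active": re.compile(r"(\S+-OST[0-9a-f]{4})_UUID now active"),
}


def summarize(text):
    """Return {state: sorted list of (ost_name, rc)}; rc is None if absent."""
    found = defaultdict(set)
    for line in text.splitlines():
        for state, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                # Only some patterns capture a return code (second group).
                rc = int(match.group(2)) if pattern.groups > 1 else None
                found[state].add((match.group(1), rc))
    return {state: sorted(items, key=lambda item: item[0])
            for state, items in found.items()}


if __name__ == "__main__":
    for state, items in summarize(SAMPLE).items():
        print(state, items)
```

Feeding it the full, unwrapped messages file rather than SAMPLE should
give a per-OST picture without reading every page by hand.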

Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_lvfs.c:612:llog_lvfs_create()) error looking up logfile
0x6250020:0x76682f2b: rc -2
Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_lvfs.c:612:llog_lvfs_create()) Skipped 2 previous similar
messages
Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_cat.c:172:llog_cat_id2handle()) error opening log id
0x6250020:76682f2b: rc -2
Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_cat.c:172:llog_cat_id2handle()) Skipped 2 previous similar
messages
Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_obd.c:279:cat_cancel_cb()) Cannot find handle for log
0x6250020
Apr 6 17:47:18 gto kernel: LustreError:
5865:0:(llog_obd.c:279:cat_cancel_cb()) Skipped 2 previous similar
messages
Apr 6 17:47:18 gto kernel: LustreError:
5863:0:(llog_obd.c:350:llog_obd_origin_setup()) with failed: -2
Apr 6 17:47:18 gto kernel: LustreError: Skipped 2 previous similar
messages
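
The rc values in these messages are negated Linux errno numbers, so a
small lookup makes the excerpts quicker to read.  This is a minimal
sketch: the table is hard-coded with the standard Linux values so the
output does not depend on the machine running it, and the names
LINUX_ERRNO and describe_rc are hypothetical helpers, not anything
Lustre ships.

```python
# Standard Linux errno numbers for the codes seen in the log excerpts.
# Hard-coded on purpose: other platforms number some of these differently.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    3: ("ESRCH", "No such process"),
    11: ("EAGAIN", "Resource temporarily unavailable"),
    16: ("EBUSY", "Device or resource busy"),
    19: ("ENODEV", "No such device"),
    110: ("ETIMEDOUT", "Connection timed out"),
    116: ("ESTALE", "Stale file handle"),
}


def describe_rc(rc):
    """Return 'rc: NAME (text)' for a negative kernel-style return code."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in table"))
    return f"{rc}: {name} ({text})"


if __name__ == "__main__":
    # Every distinct negative rc that appears in the excerpts above.
    for rc in (-116, -110, -19, -16, -11, -3, -2):
        print(describe_rc(rc))
```

The errno name only says what kind of failure the kernel reported; what
it means for a given Lustre operation still has to be read from the
surrounding message.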

Thanks,

Dan
-------------- next part --------------
An embedded message was scrubbed...
From: "Christopher J. Morrone" <morrone2 at llnl.gov>
Subject: Re: [Lustre-discuss] Unable to activate inactive OSTs
Date: Fri, 09 Apr 2010 13:54:55 -0700
Size: 1930
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100414/f9788208/attachment.eml>

