[Lustre-discuss] MDT was remounted to read-only after MDS relocation

Kalosi, Akos <Akos.Kalosi@hp.com>
Wed Jun 20 03:14:39 PDT 2012


Hi,
Our customer experienced the MDT being remounted read-only after the MDS was relocated to the other cluster node.
Relocation of the OSS services was started at the same time.
When they noticed the problem they tried to stop the MDS. The attempt was unsuccessful: the server became unresponsive and the other cluster node fenced the MDS server.
They then ran fsck, which reported a huge number of errors; the repair was unsuccessful. In the end the whole Lustre filesystem had to be recreated and restored from backup.
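For reference, the sequence was roughly the standard one. A minimal sketch, not the customer's exact commands (service and device names are taken from the logs below; the member name is assumed from the hostnames; the CLIs are the usual RHCS rgmanager and e2fsprogs tools):

  # Relocate the MDS service to the other node via rgmanager (clurgmgrd in the logs):
  clusvcadm -r service:l1mdt -m sklusp01a

  # Check the MDT after the failure: a read-only pass first (-f forces a full
  # check, -n answers "no" to every repair prompt, preserving the damaged
  # state for analysis), then the actual repair attempt:
  e2fsck -fn /dev/vgl1mdt/lvol1
  e2fsck -fy /dev/vgl1mdt/lvol1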
Is it possible to determine the root cause from the logs?

Log fragment from the node where the MDS was stopped:

Jun 17 22:52:44 sklusp01b clurgmgrd[15765]: <notice> Stopping service service:l1mdt
Jun 17 22:52:44 sklusp01b kernel: Lustre: Failing over l1-MDT0000
Jun 17 22:52:45 sklusp01b kernel: LustreError: 26561:0:(handler.c:1512:mds_handle()) operation 400 on unconnected MDS from 12345-10.214.127.76@tcp
Jun 17 22:52:45 sklusp01b kernel: LustreError: 26561:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff810cb54c7400 x1405018880115672/t0 o400-><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1339966381 ref 1 fl Interpret:H/0/0 rc -107/0
Jun 17 22:52:45 sklusp01b kernel: LustreError: 26561:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 5 previous similar messages
Jun 17 22:52:45 sklusp01b kernel: LustreError: 137-5: UUID 'l1-MDT0000_UUID' is not available  for connect (stopping)
Jun 17 22:52:46 sklusp01b kernel: LustreError: 26556:0:(handler.c:1512:mds_handle()) operation 400 on unconnected MDS from 12345-10.214.127.68@tcp
Jun 17 22:52:46 sklusp01b kernel: LustreError: 26556:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff810f26104450 x1405018911601205/t0 o400-><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1339966382 ref 1 fl Interpret:H/0/0 rc -107/0
Jun 17 22:52:46 sklusp01b kernel: LustreError: 26556:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1 previous similar message
Jun 17 22:52:46 sklusp01b kernel: LustreError: 137-5: UUID 'l1-MDT0000_UUID' is not available  for connect (stopping)
Jun 17 22:52:46 sklusp01b kernel: LustreError: 26559:0:(handler.c:1512:mds_handle()) operation 41 on unconnected MDS from 12345-10.214.127.64@tcp
Jun 17 22:52:46 sklusp01b kernel: LustreError: 26559:0:(handler.c:1512:mds_handle()) Skipped 1 previous similar message
Jun 17 22:52:46 sklusp01b kernel: LustreError: 137-5: UUID 'l1-MDT0000_UUID' is not available  for connect (stopping)
Jun 17 22:52:46 sklusp01b kernel: LustreError: Skipped 1 previous similar message
Jun 17 22:52:47 sklusp01b kernel: LustreError: 23729:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-19)  req@ffff810e04fff400 x1405018875930701/t0 o38-><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1339966467 ref 1 fl Interpret:/0/0 rc -19/0
Jun 17 22:52:47 sklusp01b kernel: LustreError: 23729:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 8 previous similar messages
Jun 17 22:52:47 sklusp01b kernel: LustreError: 26533:0:(handler.c:1512:mds_handle()) operation 41 on unconnected MDS from 12345-10.214.127.61@tcp
Jun 17 22:52:47 sklusp01b kernel: LustreError: 26533:0:(handler.c:1512:mds_handle()) Skipped 3 previous similar messages
Jun 17 22:52:47 sklusp01b kernel: LustreError: 137-5: UUID 'l1-MDT0000_UUID' is not available  for connect (stopping)
Jun 17 22:52:47 sklusp01b kernel: LustreError: Skipped 3 previous similar messages
Jun 17 22:52:48 sklusp01b kernel: LustreError: 26658:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Jun 17 22:52:48 sklusp01b kernel: LustreError: 26658:0:(ldlm_request.c:1583:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Jun 17 22:52:48 sklusp01b kernel: Lustre: Failing over l1-OST0005-osc
Jun 17 22:52:48 sklusp01b kernel: Lustre: l1-MDT0000: shutting down for failover; client state will be preserved.
Jun 17 22:52:49 sklusp01b kernel: LustreError: 26560:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff810ee43af450 x1405018398825807/t0 o41-><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1339966375 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 22:52:49 sklusp01b kernel: LustreError: 26560:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 30 previous similar messages
Jun 17 22:52:49 sklusp01b kernel: Lustre: MDT l1-MDT0000 has stopped.
Jun 17 22:52:49 sklusp01b kernel: Lustre: MGS has stopped.
Jun 17 22:52:49 sklusp01b multipathd: dm-11: umount map (uevent)
Jun 17 22:52:50 sklusp01b kernel: LustreError: 23734:0:(handler.c:1512:mds_handle()) operation 41 on unconnected MDS from 12345-10.214.127.197@tcp
Jun 17 22:52:50 sklusp01b kernel: LustreError: 23734:0:(handler.c:1512:mds_handle()) Skipped 19 previous similar messages
Jun 17 22:52:50 sklusp01b kernel: LustreError: 137-5: UUID 'l1-MDT0000_UUID' is not available  for connect (no target)
Jun 17 22:52:50 sklusp01b kernel: LustreError: Skipped 19 previous similar messages
Jun 17 22:52:51 sklusp01b kernel: Lustre: server umount l1-MDT0000 complete
Jun 17 22:52:51 sklusp01b clurgmgrd[15765]: <notice> Service service:l1mdt is stopped
Jun 17 22:52:53 sklusp01b clurgmgrd[15765]: <notice> Service service:l1mdt is now running on member 1
Jun 18 00:11:22 sklusp01b openais[15600]: [TOTEM] The token was lost in the OPERATIONAL state.
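The stop itself looks clean above ("server umount l1-MDT0000 complete"). A minimal sanity check we would normally run on the old node before the service starts elsewhere, assuming the standard Lustre and device-mapper CLIs:

  lctl dl                    # should no longer list the l1-MDT0000 or MGS devices
  dmsetup info /dev/dm-11    # "Open count: 0" once the MDT device is fully released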

Log fragment from the node where the MDS was started:

Jun 17 22:52:51 sklusp01a clurgmgrd[1649]: <notice> Starting stopped service service:l1mdt
Jun 17 22:52:52 sklusp01a kernel: Lustre: OBD class driver, http://www.lustre.org/
Jun 17 22:52:52 sklusp01a kernel: Lustre:     Lustre Version: 1.8.5
Jun 17 22:52:52 sklusp01a kernel: Lustre:     Build Version: 1.8.5-20101116203234-PRISTINE-2.6.18-194.17.1.el5_lustre.1.8.5
Jun 17 22:52:52 sklusp01a kernel: Lustre: Added LNI 10.214.127.54@tcp [8/256/0/180]
Jun 17 22:52:52 sklusp01a kernel: Lustre: Accept secure, port 988
Jun 17 22:52:52 sklusp01a kernel: Lustre: Lustre Client File System; http://www.lustre.org/
Jun 17 22:52:52 sklusp01a kernel: init dynlocks cache
Jun 17 22:52:52 sklusp01a kernel: ldiskfs created from ext3-2.6-rhel5
Jun 17 22:52:52 sklusp01a kernel: kjournald starting.  Commit interval 5 seconds
Jun 17 22:52:52 sklusp01a kernel: LDISKFS-fs warning: checktime reached, running e2fsck is recommended
Jun 17 22:52:52 sklusp01a kernel: LDISKFS FS on dm-11, internal journal
Jun 17 22:52:52 sklusp01a kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jun 17 22:52:52 sklusp01a multipathd: dm-11: umount map (uevent)
Jun 17 22:52:52 sklusp01a kernel: kjournald starting.  Commit interval 5 seconds
Jun 17 22:52:52 sklusp01a kernel: LDISKFS-fs warning: checktime reached, running e2fsck is recommended
Jun 17 22:52:52 sklusp01a kernel: LDISKFS FS on dm-11, internal journal
Jun 17 22:52:52 sklusp01a kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jun 17 22:52:52 sklusp01a kernel: Lustre: MGS MGS started
Jun 17 22:52:52 sklusp01a kernel: Lustre: MGC10.214.127.54@tcp: Reactivating import
Jun 17 22:52:52 sklusp01a kernel: Lustre: Enabling user_xattr
Jun 17 22:52:52 sklusp01a kernel: Lustre: Enabling ACL
Jun 17 22:52:52 sklusp01a kernel: Lustre: 11216:0:(mds_fs.c:677:mds_init_server_data()) RECOVERY: service l1-MDT0000, 56 recoverable clients, 0 delayed clients, last_transno 133173826553
Jun 17 22:52:52 sklusp01a kernel: Lustre: l1-MDT0000: Now serving l1-MDT0000 on /dev/vgl1mdt/lvol1 with recovery enabled
Jun 17 22:52:52 sklusp01a kernel: Lustre: l1-MDT0000: Will be in recovery for at least 5:00, or until 56 clients reconnect
Jun 17 22:52:52 sklusp01a kernel: Lustre: 11216:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) l1-MDT0000: group upcall set to /usr/sbin/l_getgroups
Jun 17 22:52:52 sklusp01a kernel: Lustre: l1-MDT0000.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
Jun 17 22:52:52 sklusp01a kernel: Lustre: 11216:0:(mds_lov.c:1155:mds_notify()) MDS l1-MDT0000: add target l1-OST0000_UUID
Jun 17 22:52:52 sklusp01a kernel: Lustre: 11216:0:(mds_lov.c:1155:mds_notify()) MDS l1-MDT0000: add target l1-OST0001_UUID
Jun 17 22:52:52 sklusp01a kernel: Lustre: 11000:0:(mds_lov.c:1191:mds_notify()) MDS l1-MDT0000: in recovery, not resetting orphans on l1-OST0001_UUID
Jun 17 22:52:53 sklusp01a kernel: Lustre: 11000:0:(mds_lov.c:1191:mds_notify()) MDS l1-MDT0000: in recovery, not resetting orphans on l1-OST0003_UUID
Jun 17 22:52:53 sklusp01a kernel: LustreError: 11221:0:(llog_lvfs.c:612:llog_lvfs_create()) error looking up logfile 0x4e1f1f3:0xf64cbe48: rc -2
Jun 17 22:52:53 sklusp01a kernel: LustreError: 11221:0:(llog_cat.c:172:llog_cat_id2handle()) error opening log id 0x4e1f1f3:f64cbe48: rc -2
Jun 17 22:52:53 sklusp01a kernel: LustreError: 11221:0:(llog_obd.c:291:cat_cancel_cb()) Cannot find handle for log 0x4e1f1f3
Jun 17 22:52:53 sklusp01a clurgmgrd[1649]: <notice> Service service:l1mdt started
Jun 17 22:52:57 sklusp01a kernel: Lustre: 11000:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1405056578486279 sent from l1-OST0000-osc to NID 10.214.127.55@tcp 5s ago has timed out (5s prior to deadline).
Jun 17 22:52:57 sklusp01a kernel:   req@ffff8111def0ec00 x1405056578486279/t0 o8->l1-OST0000_UUID@10.214.127.55@tcp:28/4 lens 368/584 e 0 to 1 dl 1339966377 ref 1 fl Rpc:N/0/0 rc 0/0
Jun 17 22:52:57 sklusp01a kernel: Lustre: 11000:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1405056578486281 sent from l1-OST0002-osc to NID 10.214.127.56@tcp 5s ago has timed out (5s prior to deadline).
Jun 17 22:52:57 sklusp01a kernel:   req@ffff8111ef4cc400 x1405056578486281/t0 o8->l1-OST0002_UUID@10.214.127.56@tcp:28/4 lens 368/584 e 0 to 1 dl 1339966377 ref 1 fl Rpc:N/0/0 rc 0/0
Jun 17 22:52:58 sklusp01a kernel: Lustre: 11000:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1405056578486283 sent from l1-OST0004-osc to NID 10.214.127.57@tcp 5s ago has timed out (5s prior to deadline).
Jun 17 22:52:58 sklusp01a kernel:   req@ffff8111d3cb5400 x1405056578486283/t0 o8->l1-OST0004_UUID@10.214.127.57@tcp:28/4 lens 368/584 e 0 to 1 dl 1339966378 ref 1 fl Rpc:N/0/0 rc 0/0
Jun 17 22:52:58 sklusp01a kernel: Lustre: 11000:0:(mds_lov.c:1191:mds_notify()) MDS l1-MDT0000: in recovery, not resetting orphans on l1-OST0002_UUID
Jun 17 22:52:58 sklusp01a kernel: Lustre: 11000:0:(mds_lov.c:1191:mds_notify()) Skipped 1 previous similar message
Jun 17 22:53:07 sklusp01a kernel: Lustre: 11164:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 55 recoverable clients remain
Jun 17 22:53:07 sklusp01a kernel: Lustre: 11155:0:(mds_open.c:895:mds_open_by_fid()) Orphan 1a6fb00:f64cdf9a found and opened in PENDING directory
Jun 17 22:53:07 sklusp01a kernel: Lustre: 11171:0:(mds_open.c:895:mds_open_by_fid()) Orphan 1a6f455:f64baabd found and opened in PENDING directory
Jun 17 22:53:08 sklusp01a kernel: Lustre: 11171:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 54 recoverable clients remain
Jun 17 22:53:08 sklusp01a kernel: Lustre: 11180:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 52 recoverable clients remain
Jun 17 22:53:08 sklusp01a kernel: Lustre: 11180:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) Skipped 1 previous similar message
Jun 17 22:53:10 sklusp01a kernel: Lustre: 11161:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 46 recoverable clients remain
Jun 17 22:53:10 sklusp01a kernel: Lustre: 11161:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) Skipped 5 previous similar messages
Jun 17 22:53:12 sklusp01a kernel: Lustre: 11172:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 26 recoverable clients remain
Jun 17 22:53:12 sklusp01a kernel: Lustre: 11172:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) Skipped 19 previous similar messages
Jun 17 22:53:21 sklusp01a kernel: Lustre: 11168:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 17 recoverable clients remain
Jun 17 22:53:21 sklusp01a kernel: Lustre: 11168:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) Skipped 8 previous similar messages
Jun 17 22:53:33 sklusp01a kernel: Lustre: 11180:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) l1-MDT0000: 2 recoverable clients remain
Jun 17 22:53:33 sklusp01a kernel: Lustre: 11180:0:(ldlm_lib.c:1815:target_queue_last_replay_reply()) Skipped 14 previous similar messages
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11180:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.88@tcp
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11180:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111da988000 x1405018449265568/t0 o101-><?>@<?>:0/0 lens 512/0 e 0 to 0 dl 1339967029 ref 1 fl Interpret:/4/0 rc -107/0
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11157:0:(ldlm_lib.c:944:target_handle_connect()) l1-MDT0000: denying connection for new client 10.214.127.88@tcp (c67b1cbf-4ed5-dafb-323a-0164138a6efb): 1 clients in recovery for 0s
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11175:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.216@tcp
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11175:0:(handler.c:1512:mds_handle()) Skipped 1 previous similar message
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11180:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111f8b18000 x1402389212741079/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967029 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:07 sklusp01a kernel: LustreError: 11180:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1 previous similar message
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11155:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.216@tcp
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11175:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111bf5cc800 x1402389212742543/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967030 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11163:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111bfd6cc00 x1402389212742542/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967030 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11175:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1462 previous similar messages
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11163:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1462 previous similar messages
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11155:0:(handler.c:1512:mds_handle()) Skipped 1733 previous similar messages
Jun 17 23:03:08 sklusp01a kernel: LDISKFS-fs error (device dm-11): ldiskfs_lookup: unlinked inode 27720411 in dir #29287441
Jun 17 23:03:08 sklusp01a kernel: Remounting filesystem read-only
Jun 17 23:03:08 sklusp01a kernel: Lustre: 11174:0:(mds_unlink_open.c:324:mds_cleanup_pending()) l1-MDT0000: removed 2 pending open-unlinked files
Jun 17 23:03:08 sklusp01a kernel: Lustre: l1-MDT0000: Post recovery failed, rc -2
Jun 17 23:03:08 sklusp01a kernel: Lustre: l1-MDT0000: Recovery period over after 10:01, of 56 clients 54 recovered and 2 were evicted.
Jun 17 23:03:08 sklusp01a kernel: Lustre: l1-MDT0000: sending delayed replies to recovered clients
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11171:0:(fsfilt-ldiskfs.c:366:fsfilt_ldiskfs_start()) error starting handle for op 8 (71 credits): rc -30
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11155:0:(fsfilt-ldiskfs.c:366:fsfilt_ldiskfs_start()) error starting handle for op 8 (71 credits): rc -30
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11155:0:(mds_reint.c:251:mds_finish_transno()) fsfilt_start: -30
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11170:0:(mds_reint.c:251:mds_finish_transno()) fsfilt_start: -30
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11170:0:(mds_reint.c:251:mds_finish_transno()) Skipped 1 previous similar message
Jun 17 23:03:08 sklusp01a kernel: LustreError: 11171:0:(fsfilt-ldiskfs.c:366:fsfilt_ldiskfs_start()) Skipped 67 previous similar messages
Jun 17 23:03:08 sklusp01a kernel: LDISKFS-fs warning (device dm-11): kmmpd: kmmpd being stopped since filesystem has been remounted as readonly.
Jun 17 23:03:09 sklusp01a kernel: LustreError: 11149:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.216@tcp
Jun 17 23:03:09 sklusp01a kernel: LustreError: 11149:0:(handler.c:1512:mds_handle()) Skipped 2611 previous similar messages
Jun 17 23:03:10 sklusp01a kernel: LustreError: 11151:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111c98d1400 x1402389212748277/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967032 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:10 sklusp01a kernel: LustreError: 11151:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 5814 previous similar messages
Jun 17 23:03:11 sklusp01a kernel: LustreError: 11158:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.216@tcp
Jun 17 23:03:11 sklusp01a kernel: LustreError: 11158:0:(handler.c:1512:mds_handle()) Skipped 5620 previous similar messages
Jun 17 23:03:13 sklusp01a kernel: LustreError: 11134:0:(llog_lvfs.c:577:llog_filp_open()) logfile creation CONFIGS/l1-client: -30
Jun 17 23:03:13 sklusp01a kernel: LustreError: 11134:0:(mgs_handler.c:672:mgs_handle()) MGS handle cmd=501 rc=-30
Jun 17 23:03:14 sklusp01a kernel: LustreError: 11178:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111cc713800 x1402389212759445/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967036 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:14 sklusp01a kernel: LustreError: 11178:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 11117 previous similar messages
Jun 17 23:03:14 sklusp01a kernel: LustreError: 11156:0:(fsfilt-ldiskfs.c:366:fsfilt_ldiskfs_start()) error starting handle for op 8 (71 credits): rc -30
Jun 17 23:03:14 sklusp01a kernel: LustreError: 11156:0:(fsfilt-ldiskfs.c:366:fsfilt_ldiskfs_start()) Skipped 972 previous similar messages
Jun 17 23:03:14 sklusp01a kernel: LustreError: 11156:0:(mds_fs.c:236:mds_client_add()) unable to start transaction: rc -30
Jun 17 23:03:15 sklusp01a kernel: LustreError: 13214:0:(handler.c:1512:mds_handle()) operation 101 on unconnected MDS from 12345-10.214.127.216@tcp
Jun 17 23:03:15 sklusp01a kernel: LustreError: 13214:0:(handler.c:1512:mds_handle()) Skipped 11051 previous similar messages
Jun 17 23:03:22 sklusp01a kernel: LustreError: 13217:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8111f9c7d800 x1402389212780542/t0 o101-><?>@<?>:0/0 lens 296/0 e 0 to 0 dl 1339967044 ref 1 fl Interpret:/0/0 rc -107/0
Jun 17 23:03:22 sklusp01a kernel: LustreError: 13217:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 21051 previous similar messages
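Side note: the LDISKFS error above names concrete inode numbers. If an image of the original device was kept, those inodes could be inspected read-only with standard e2fsprogs, for example:

  # -c (catastrophic mode) opens the filesystem read-only even when it is
  # damaged; -R runs a single debugfs command.
  debugfs -c -R 'stat <29287441>' /dev/vgl1mdt/lvol1     # the directory inode
  debugfs -c -R 'ncheck 27720411' /dev/vgl1mdt/lvol1     # pathname(s) of the unlinked inode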

Thanks for any hints,
Akos Kalosi
