<div dir="ltr">Hello Stephane,<div><br></div><div>Thanks for your feedback.</div><div><br></div><div>Why did you run e2fsck?</div><div>I was suspecting some errors but the e2fsck didn't see anything <br>Did e2fsck fix something?</div><div>no<br>What version of e2fsprogs are you using?<br></div><div>e2fsprogs-1.46.2.wc3-0.el7.x86_64<br></div><div><br></div><div>The device had no free i-nodes anymore</div><div>so I mounted the device with mount -t ldiskfs mdtdevice /mnt to be able to free up some space.</div><div>But after we still could not mount the mdt</div><div><br></div><div>Mar 6 11:23:51 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc<br>Mar 6 11:23:52 mds01 kernel: LustreError: 11-0: data-MDT0001-osp-MDT0000: operation mds_connect to node 0@lo failed: rc = -114<br>Mar 6 11:23:52 mds01 kernel: LustreError: Skipped 9 previous similar messages<br>Mar 6 11:23:52 mds01 kernel: LustreError: 79765:0:(genops.c:556:class_register_device()) data-OST0000-osc-MDT0000: already exists, won't add<br>Mar 6 11:23:52 mds01 kernel: LustreError: 79765:0:(obd_config.c:1835:class_config_llog_handler()) MGC1@tcp14: cfg command failed: rc = -17<br>Mar 6 11:23:52 mds01 kernel: Lustre: cmd=cf001 0:data-OST0000-osc-MDT0000 1:osp 2:data-MDT0000-mdtlov_UUID <br>Mar 6 11:23:52 mds01 kernel: LustreError: 15c-8: MGC@tcp14: The configuration from log 'data-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.<br>Mar 6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server data-MDT0000: -17<br>Mar 6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17<br>Mar 6 11:23:52 mds01 kernel: Lustre: Failing over data-MDT0000<br>Mar 6 11:23:52 mds01 kernel: Lustre: data-MDT0000: Not available for connect from @o2ib4 (stopping)<br>Mar 6 11:23:53 mds01 kernel: Lustre: server umount data-MDT0000 complete<br>Mar 6 11:23:53 mds01 kernel: LustreError: 79753:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-17)<br></div><div><br></div><div>Robin</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Mar 5, 2023 at 2:07 AM Stephane Thiell <<a href="mailto:sthiell@stanford.edu">sthiell@stanford.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Robin,<br>
<br>
Sorry to hear about your problem.<br>
<br>
A few questions…<br>
<br>
Why did you run e2fsck?<br>
Did e2fsck fix something?<br>
What version of e2fsprogs are you using?<br>
<br>
errno 28 is ENOSPC; what does dumpe2fs say about available space?<br>
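<br>
(As an aside: if you want to decode these kernel errno values yourself, and assuming the kernel headers are installed on the MDS, a rough sketch like the following should work. For reference, the -17 that shows up later in your log is EEXIST, "File exists".)<br>
<br>
grep -w 28 /usr/include/asm-generic/errno-base.h<br>
grep -w 17 /usr/include/asm-generic/errno-base.h<br>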
<br>
You can check the values of "Free blocks" and "Free inodes" using this command:<br>
<br>
dumpe2fs -h /dev/mapper/****-MDT0000<br>
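<br>
If it helps, here is a rough sketch to pull out just the relevant header fields in one go (same device path as above; the field list is simply what I would look at first):<br>
<br>
dumpe2fs -h /dev/mapper/****-MDT0000 2>/dev/null | egrep -i 'inode count|block count|free blocks|free inodes'<br>
<br>
If the MDT really ran out of space or inodes, I would expect "Free blocks" or "Free inodes" to be at or near zero.<br>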
<br>
<br>
Best,<br>
Stephane<br>
<br>
<br>
> On Mar 2, 2023, at 2:08 AM, Teeninga, Robin via lustre-discuss <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>> wrote:<br>
> <br>
> Hello,<br>
> <br>
> I've did an e2fsck on my MDT and after that I could not mount the MDT anymore<br>
> It gives me this error when I've tried to mount the filesystem<br>
> any ideas how to resolve this?<br>
> <br>
> We are running Lustre server 2.12.7 on CentOS 7.9<br>
> mount.lustre: mount /dev/mapper/****-MDT0000 at /lustre/****-MDT0000 failed: File exists<br>
> <br>
> <br>
> Mar 2 10:58:35 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: 160060:0:(llog.c:1398:llog_backup()) MGC****@tcp14: failed to open backup logfile ****-MDT0000T: rc = -28<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: 160060:0:(mgc_request.c:1879:mgc_llog_local_copy()) MGC****@tcp14: failed to copy remote log ****-MDT0000: rc = -28<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: 137-5: ****-MDT0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: Skipped 4 previous similar messages<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: 160127:0:(genops.c:556:class_register_device()) *****-OST0000-osc-MDT0000: already exists, won't add<br>
> Mar 2 10:58:35 mds01 kernel: LustreError: 160127:0:(obd_config.c:1835:class_config_llog_handler()) MGC****@tcp14: cfg command failed: rc = -17<br>
> Mar 2 10:58:36 mds01 kernel: Lustre: cmd=cf001 0:****-OST0000-osc-MDT0000 1:osp 2:****-MDT0000-mdtlov_UUID <br>
> Mar 2 10:58:36 mds01 kernel: LustreError: 15c-8: MGC****@tcp14: The configuration from log '****-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.<br>
> Mar 2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server ****-MDT0000: -17<br>
> Mar 2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17<br>
> Mar 2 10:58:36 mds01 kernel: Lustre: Failing over ****-MDT0000<br>
> Mar 2 10:58:37 mds01 kernel: Lustre: server umount ****-MDT0000 complete<br>
> Mar 2 10:58:37 mds01 kernel: LustreError: 160060:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-17)<br>
> <br>
> <br>
> Regards,<br>
> <br>
> Robin<br>
> _______________________________________________<br>
> lustre-discuss mailing list<br>
> <a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a><br>
> <a href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org" rel="noreferrer" target="_blank">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
<br>
</blockquote></div>