[lustre-discuss] Filesystem could not mount after e2fsck

Stephane Thiell sthiell at stanford.edu
Mon Mar 6 13:50:46 PST 2023


Hi Robin,

Your MDT configuration log seems to be corrupt somehow.
Those "already exists, won't add" errors remind me of a ticket we opened a while back: https://jira.whamcloud.com/browse/LU-15000
That was also on 2.12, but only because we had used lctl llog_cancel (or the newer lctl del_ost command). A patch is required on 2.12 to fix it.
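Before doing anything destructive, and if the MGS is still reachable, you could dump the configuration log and look for duplicate records. A minimal sketch, assuming the log name matches your fsname (adjust "data-MDT0000" accordingly):

  lctl --device MGS llog_print data-MDT0000

Duplicate add records for the same data-OST0000-osc-MDT0000 device would line up with the class_register_device() errors you're seeing.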

At this point, to be able to mount, you could regenerate all configuration logs by following the writeconf procedure detailed in the Lustre manual.
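Roughly, it looks like the sketch below; the device paths are placeholders, so double-check the steps for your exact version in the manual, and make sure clients and all targets are unmounted cluster-wide first:

  # with every target unmounted, regenerate the configuration logs
  tunefs.lustre --writeconf /dev/mapper/****-MDT0000    # MGS/MDT first
  tunefs.lustre --writeconf /dev/<ost_device>           # then each OST, on its OSS
  # remount in order: MGS/MDT first, then OSTs, then clients
  mount -t lustre /dev/mapper/****-MDT0000 /lustre/****-MDT0000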

Good luck!

Stephane

On Mar 6, 2023, at 2:27 AM, Teeninga, Robin <r.teeninga at rug.nl> wrote:

Hello Stephane,

Thanks for your feedback.

Why did you run e2fsck?
I suspected some errors, but e2fsck didn't find anything.
Did e2fsck fix something?
No.
What version of e2fsprogs are you using?
e2fsprogs-1.46.2.wc3-0.el7.x86_64

The device had no free inodes anymore, so I mounted it with mount -t ldiskfs mdtdevice /mnt to free up some space.
But afterwards we still could not mount the MDT:

Mar  6 11:23:51 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Mar  6 11:23:52 mds01 kernel: LustreError: 11-0: data-MDT0001-osp-MDT0000: operation mds_connect to node 0 at lo failed: rc = -114
Mar  6 11:23:52 mds01 kernel: LustreError: Skipped 9 previous similar messages
Mar  6 11:23:52 mds01 kernel: LustreError: 79765:0:(genops.c:556:class_register_device()) data-OST0000-osc-MDT0000: already exists, won't add
Mar  6 11:23:52 mds01 kernel: LustreError: 79765:0:(obd_config.c:1835:class_config_llog_handler()) MGC1 at tcp14: cfg command failed: rc = -17
Mar  6 11:23:52 mds01 kernel: Lustre:    cmd=cf001 0:data-OST0000-osc-MDT0000  1:osp  2:data-MDT0000-mdtlov_UUID
Mar  6 11:23:52 mds01 kernel: LustreError: 15c-8: MGC at tcp14: The configuration from log 'data-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar  6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server data-MDT0000: -17
Mar  6 11:23:52 mds01 kernel: LustreError: 79753:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
Mar  6 11:23:52 mds01 kernel: Lustre: Failing over data-MDT0000
Mar  6 11:23:52 mds01 kernel: Lustre: data-MDT0000: Not available for connect from @o2ib4 (stopping)
Mar  6 11:23:53 mds01 kernel: Lustre: server umount data-MDT0000 complete
Mar  6 11:23:53 mds01 kernel: LustreError: 79753:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)

Robin

On Sun, Mar 5, 2023 at 2:07 AM Stephane Thiell <sthiell at stanford.edu> wrote:
Hi Robin,

Sorry to hear about your problem.

A few questions…

Why did you run e2fsck?
Did e2fsck fix something?
What version of e2fsprogs are you using?

errno 28 is ENOSPC; what does dumpe2fs say about available space?

You can check the values of "Free blocks" and "Free inodes" using this command:

dumpe2fs -h /dev/mapper/****-MDT0000
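If the output is long, you can narrow it down to just those counters, e.g.:

  dumpe2fs -h /dev/mapper/****-MDT0000 | grep -i free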


Best,
Stephane


> On Mar 2, 2023, at 2:08 AM, Teeninga, Robin via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>
> Hello,
>
> I ran e2fsck on my MDT, and after that I could not mount the MDT anymore.
> It gives me this error when I try to mount the filesystem.
> Any ideas how to resolve this?
>
> We are running Lustre server 2.12.7 on CentOS 7.9
> mount.lustre: mount /dev/mapper/****-MDT0000 at /lustre/****-MDT0000 failed: File exists
>
>
> Mar  2 10:58:35 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
> Mar  2 10:58:35 mds01 kernel: LustreError: 160060:0:(llog.c:1398:llog_backup()) MGC****@tcp14: failed to open backup logfile ****-MDT0000T: rc = -28
> Mar  2 10:58:35 mds01 kernel: LustreError: 160060:0:(mgc_request.c:1879:mgc_llog_local_copy()) MGC****@tcp14: failed to copy remote log ****-MDT0000: rc = -28
> Mar  2 10:58:35 mds01 kernel: LustreError: 137-5: ****-MDT0001_UUID: not available for connect from 0 at lo (no target). If you are running an HA pair check that the target is mounted on the other server.
> Mar  2 10:58:35 mds01 kernel: LustreError: Skipped 4 previous similar messages
> Mar  2 10:58:35 mds01 kernel: LustreError: 160127:0:(genops.c:556:class_register_device()) *****-OST0000-osc-MDT0000: already exists, won't add
> Mar  2 10:58:35 mds01 kernel: LustreError: 160127:0:(obd_config.c:1835:class_config_llog_handler()) MGC****@tcp14: cfg command failed: rc = -17
> Mar  2 10:58:36 mds01 kernel: Lustre:    cmd=cf001 0:****-OST0000-osc-MDT0000  1:osp  2:****-MDT0000-mdtlov_UUID
> Mar  2 10:58:36 mds01 kernel: LustreError: 15c-8: MGC****@tcp14: The configuration from log '****-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> Mar  2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server ****-MDT0000: -17
> Mar  2 10:58:36 mds01 kernel: LustreError: 160060:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
> Mar  2 10:58:36 mds01 kernel: Lustre: Failing over ****-MDT0000
> Mar  2 10:58:37 mds01 kernel: Lustre: server umount ****-MDT0000 complete
> Mar  2 10:58:37 mds01 kernel: LustreError: 160060:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)
>
>
> Regards,
>
> Robin
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

