[lustre-discuss] Filesystem could not mount after e2fsck

Teeninga, Robin r.teeninga at rug.nl
Mon Mar 6 02:27:49 PST 2023


Hello Stephane,

Thanks for your feedback.

> Why did you run e2fsck?
I suspected some errors, but e2fsck did not report anything.
> Did e2fsck fix something?
No.
> What version of e2fsprogs are you using?
e2fsprogs-1.46.2.wc3-0.el7.x86_64

The device had no free inodes left, so I mounted it with
mount -t ldiskfs mdtdevice /mnt to be able to free up some space.
But after that we still could not mount the MDT.
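
For reference, the free inode and block counts can be read from the header that dumpe2fs -h prints. Below is a minimal Python sketch of that check. The helper names and the sample header are made up for illustration; the numbers are not from the MDT discussed here.

```python
def parse_dumpe2fs_header(text):
    """Return the 'Key: value' pairs of a `dumpe2fs -h` header as a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons
        # (for example timestamps) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def free_fraction(fields, free_key, total_key):
    """Fraction of a resource that is still free, from two header fields."""
    return int(fields[free_key]) / int(fields[total_key])


# Hypothetical sample header (illustrative values only).
SAMPLE = """\
Filesystem volume name:   example-MDT0000
Inode count:              1000
Block count:              4000
Free blocks:              1000
Free inodes:              0
"""

fields = parse_dumpe2fs_header(SAMPLE)
inode_free = free_fraction(fields, "Free inodes", "Inode count")
block_free = free_fraction(fields, "Free blocks", "Block count")
print(f"free inodes: {inode_free:.1%}, free blocks: {block_free:.1%}")
```

On a real server the text would come from the command itself, for example subprocess.run(["dumpe2fs", "-h", device], capture_output=True, text=True).stdout, where device is the MDT block device.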

Mar  6 11:23:51 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem with
ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Mar  6 11:23:52 mds01 kernel: LustreError: 11-0: data-MDT0001-osp-MDT0000:
operation mds_connect to node 0 at lo failed: rc = -114
Mar  6 11:23:52 mds01 kernel: LustreError: Skipped 9 previous similar
messages
Mar  6 11:23:52 mds01 kernel: LustreError:
79765:0:(genops.c:556:class_register_device()) data-OST0000-osc-MDT0000:
already exists, won't add
Mar  6 11:23:52 mds01 kernel: LustreError:
79765:0:(obd_config.c:1835:class_config_llog_handler()) MGC1 at tcp14: cfg
command failed: rc = -17
Mar  6 11:23:52 mds01 kernel: Lustre:    cmd=cf001
0:data-OST0000-osc-MDT0000  1:osp  2:data-MDT0000-mdtlov_UUID
Mar  6 11:23:52 mds01 kernel: LustreError: 15c-8: MGC at tcp14: The
configuration from log 'data-MDT0000' failed (-17). This may be the result
of communication errors between this node and the MGS, a bad configuration,
or other errors. See the syslog for more information.
Mar  6 11:23:52 mds01 kernel: LustreError:
79753:0:(obd_mount_server.c:1397:server_start_targets()) failed to start
server data-MDT0000: -17
Mar  6 11:23:52 mds01 kernel: LustreError:
79753:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start
targets: -17
Mar  6 11:23:52 mds01 kernel: Lustre: Failing over data-MDT0000
Mar  6 11:23:52 mds01 kernel: Lustre: data-MDT0000: Not available for
connect from @o2ib4 (stopping)
Mar  6 11:23:53 mds01 kernel: Lustre: server umount data-MDT0000 complete
Mar  6 11:23:53 mds01 kernel: LustreError:
79753:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)
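
The rc values in these messages are negated Linux errno codes: -17 is EEXIST (the "File exists" that mount.lustre reports), -28 is ENOSPC, and -114 is EALREADY on Linux. Below is a minimal Python sketch for decoding them. It assumes a Linux host, since errno numbering is platform-specific, and the helper names are made up for illustration. The sample lines are shortened excerpts of the logs in this thread.

```python
import errno
import os
import re


def rc_values(log_text):
    """Extract every 'rc = -N' value from a chunk of log text."""
    return [int(m) for m in re.findall(r"rc = (-\d+)", log_text)]


def describe_rc(rc):
    """Map a negative kernel return code to its errno name and message."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# Shortened excerpts of the kernel messages quoted in this thread.
LOG = """\
operation mds_connect to node 0 at lo failed: rc = -114
cfg command failed: rc = -17
failed to open backup logfile ****-MDT0000T: rc = -28
"""

for rc in rc_values(LOG):
    print(rc, describe_rc(rc))
```

Only the "rc = ..." form is matched here. Bare suffixes such as "(-17)" or ": -17" would need an extra pattern.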

Robin

On Sun, Mar 5, 2023 at 2:07 AM Stephane Thiell <sthiell at stanford.edu> wrote:

> Hi Robin,
>
> Sorry to hear about your problem.
>
> A few questions…
>
> Why did you run e2fsck?
> Did e2fsck fix something?
> What version of e2fsprogs are you using?
>
> errno 28 is ENOSPC, what does dumpe2fs say about available space?
>
> You can check the values of "Free blocks" and "Free inodes" using this
> command:
>
> dumpe2fs -h /dev/mapper/****-MDT0000
>
>
> Best,
> Stephane
>
>
> > On Mar 2, 2023, at 2:08 AM, Teeninga, Robin via lustre-discuss <
> lustre-discuss at lists.lustre.org> wrote:
> >
> > Hello,
> >
> > I ran e2fsck on my MDT, and after that I could not mount the MDT
> anymore.
> > It gives me this error when I try to mount the filesystem.
> > Any ideas how to resolve this?
> >
> > We are running Lustre server 2.12.7 on CentOS 7.9
> > mount.lustre: mount /dev/mapper/****-MDT0000 at /lustre/****-MDT0000
> failed: File exists
> >
> >
> > Mar  2 10:58:35 mds01 kernel: LDISKFS-fs (dm-19): mounted filesystem
> with ordered  mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
> > Mar  2 10:58:35 mds01 kernel: LustreError:
> 160060:0:(llog.c:1398:llog_backup()) MGC****@tcp14: failed to open backup
> logfile ****-MDT0000T: rc = -28
> > Mar  2 10:58:35 mds01 kernel: LustreError:
> 160060:0:(mgc_request.c:1879:mgc_llog_local_copy()) MGC****@tcp14: failed
> to copy remote log ****-MDT0000: rc = -28
> > Mar  2 10:58:35 mds01 kernel: LustreError: 137-5: ****-MDT0001_UUID: not
> available for connect from 0 at lo (no target). If you are running an HA
> pair check that the target is mounted on the other server.
> > Mar  2 10:58:35 mds01 kernel: LustreError: Skipped 4 previous similar
> messages
> > Mar  2 10:58:35 mds01 kernel: LustreError:
> 160127:0:(genops.c:556:class_register_device()) *****-OST0000-osc-MDT0000:
> already exists, won't add
> > Mar  2 10:58:35 mds01 kernel: LustreError:
> 160127:0:(obd_config.c:1835:class_config_llog_handler()) MGC****@tcp14: cfg
> command failed: rc = -17
> > Mar  2 10:58:36 mds01 kernel: Lustre:    cmd=cf001
> 0:****-OST0000-osc-MDT0000  1:osp  2:****-MDT0000-mdtlov_UUID
> > Mar  2 10:58:36 mds01 kernel: LustreError: 15c-8: MGC****@tcp14: The
> configuration from log '****-MDT0000' failed (-17). This may be the result
> of communication errors between this node and the MGS, a bad configuration,
> or other errors. See the syslog for more information.
> > Mar  2 10:58:36 mds01 kernel: LustreError:
> 160060:0:(obd_mount_server.c:1397:server_start_targets()) failed to start
> server ****-MDT0000: -17
> > Mar  2 10:58:36 mds01 kernel: LustreError:
> 160060:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start
> targets: -17
> > Mar  2 10:58:36 mds01 kernel: Lustre: Failing over ****-MDT0000
> > Mar  2 10:58:37 mds01 kernel: Lustre: server umount ****-MDT0000 complete
> > Mar  2 10:58:37 mds01 kernel: LustreError:
> 160060:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)
> >
> >
> > Regards,
> >
> > Robin
>
>