[lustre-discuss] MDT will not mount

Hans Henrik Happe happe at nbi.dk
Thu Mar 10 03:23:12 PST 2022


After upgrading to Lustre 2.12.8 I found that the first mount after a 
reboot behaves differently:

Mounting mds02/astro0 on /mnt/lustre/local/astro-MDT0000
mount.lustre: mount mds02/astro0 at /mnt/lustre/local/astro-MDT0000 
failed: No space left on device

The syslog output is also different (attached as syslog-0).

Doing the mount again has this error:

Mounting mds02/astro0 on /mnt/lustre/local/astro-MDT0000
mount.lustre: mount mds02/astro0 at /mnt/lustre/local/astro-MDT0000 
failed: File exists

The syslog is like the one I first posted. The new output is attached as syslog-1.
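
For reference, the two mount.lustre messages line up with the negative return codes in the attached syslogs: "No space left on device" is rc = -28 (ENOSPC) and "File exists" is rc = -17 (EEXIST). Below is a minimal Python sketch for decoding such codes from a log line, assuming Linux errno numbering. The helper names describe_rc and rc_from_line are made up for illustration and are not part of Lustre or its tools.

```python
import errno
import os
import re

# Matches the "rc = -28", "failed (-28)" and "...: -28" forms seen in the syslogs.
RC_PATTERN = re.compile(r"(?:rc = |\(|: )(-\d+)")


def rc_from_line(line):
    """Return the first negative return code found in a log line, or None."""
    match = RC_PATTERN.search(line)
    return int(match.group(1)) if match else None


def describe_rc(rc):
    """Map a kernel-style negative return code to its errno name and message."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    sample = ("LustreError: 5191:0:(osp_sync.c:1524:osp_sync_init()) "
              "astro-OST0002-osc-MDT0000: can't initialize llog: rc = -28")
    rc = rc_from_line(sample)
    if rc is not None:
        print(rc, describe_rc(rc))
```

This only translates the codes. It does not explain why the llog catalog ran out of free slots.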

Finally, stopping Lustre (only the MGS in this case) and the lnet service 
does not free all resources, which makes lustre_rmmod fail:

# lustre_rmmod
rmmod: ERROR: Module osp is in use


Cheers,
Hans Henrik

On 10.03.2022 11.15, Hans Henrik Happe via lustre-discuss wrote:
> Forgot to say this is Lustre 2.12.6 and CentOS 7.9 
> (3.10.0-1160.6.1.el7.x86_64).
>
> On 10.03.2022 10.27, Hans Henrik Happe via lustre-discuss wrote:
>> Hi,
>>
>> A reboot of the MDS stalled and got forced reset. After that the MDS 
>> would not start. The syslog is attached.
>>
>> I'm not sure what the "class_register_device()) 
>> astro-OST0002-osc-MDT0000" part is supposed to do but astro-OST0002 
>> is not mounted at this time. I guess this comes from the MGS.
>>
>> Cheers,
>> Hans Henrik
>>
-------------- next part --------------
Mar 10 12:08:15 mds02 kernel: Lustre: MGS: Connection restored to 3be12548-8d1b-39d8-1ec0-0381833f8bc2 (at 172.20.200.30@tcp1)
Mar 10 12:08:15 mds02 kernel: Lustre: Skipped 42 previous similar messages
Mar 10 12:08:33 mds02 kernel: Lustre: 5191:0:(llog_cat.c:93:llog_cat_new_log()) astro-OST0002-osc-MDT0000: there are no more free slots in catalog [0x2:0x1:0x0]:0
Mar 10 12:08:33 mds02 kernel: LustreError: 5191:0:(osp_sync.c:1524:osp_sync_init()) astro-OST0002-osc-MDT0000: can't initialize llog: rc = -28
Mar 10 12:08:33 mds02 kernel: LustreError: 5191:0:(obd_config.c:559:class_setup()) setup astro-OST0002-osc-MDT0000 failed (-28)
Mar 10 12:08:33 mds02 kernel: LustreError: 5191:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.21.10.102@o2ib: cfg command failed: rc = -28
Mar 10 12:08:33 mds02 kernel: Lustre:    cmd=cf003 0:astro-OST0002-osc-MDT0000  1:astro-OST0002_UUID  2:172.21.10.116@tcp  
Mar 10 12:08:33 mds02 kernel: LustreError: 15c-8: MGC10.21.10.102@o2ib: The configuration from log 'astro-MDT0000' failed (-28). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar 10 12:08:33 mds02 kernel: LustreError: 5131:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server astro-MDT0000: -28
Mar 10 12:08:33 mds02 kernel: LustreError: 5131:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -28
Mar 10 12:08:33 mds02 kernel: Lustre: Failing over astro-MDT0000
Mar 10 12:08:33 mds02 kernel: Lustre: server umount astro-MDT0000 complete
Mar 10 12:08:33 mds02 kernel: LustreError: 5131:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-28)

-------------- next part --------------
Mar 10 12:10:56 mds02 kernel: LustreError: 5622:0:(genops.c:556:class_register_device()) astro-OST0002-osc-MDT0000: already exists, won't add
Mar 10 12:10:56 mds02 kernel: LustreError: 5622:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.21.10.102@o2ib: cfg command failed: rc = -17
Mar 10 12:10:56 mds02 kernel: Lustre:    cmd=cf001 0:astro-OST0002-osc-MDT0000  1:osp  2:astro-MDT0000-mdtlov_UUID  
Mar 10 12:10:56 mds02 kernel: LustreError: 15c-8: MGC10.21.10.102@o2ib: The configuration from log 'astro-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar 10 12:10:56 mds02 kernel: LustreError: 5566:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server astro-MDT0000: -17
Mar 10 12:10:56 mds02 kernel: LustreError: 5566:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
Mar 10 12:10:56 mds02 kernel: Lustre: Failing over astro-MDT0000
Mar 10 12:10:56 mds02 kernel: Lustre: server umount astro-MDT0000 complete
Mar 10 12:10:56 mds02 kernel: LustreError: 5566:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)


