[lustre-discuss] MDT will not mount

Hans Henrik Happe happe at nbi.dk
Thu Mar 10 01:27:01 PST 2022


Hi,

A reboot of the MDS stalled and the node had to be forcibly reset. After
that, the MDT would not mount. The syslog is attached.

I'm not sure what the "class_register_device())
astro-OST0002-osc-MDT0000" part is supposed to do, but astro-OST0002 is
not mounted at this time. I guess this device entry comes from the
configuration log on the MGS.

Cheers,
Hans Henrik
-------------- next part --------------
Mar 10 10:03:49 mds02 kernel: Lustre: MGS: Connection restored to d8787407-db0d-ccfb-e5ab-adeb41b86c1d (at 0 at lo)
Mar 10 10:03:49 mds02 kernel: Lustre: Skipped 197 previous similar messages
Mar 10 10:03:59 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not available for connect from 10.21.207.78 at o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
Mar 10 10:03:59 mds02 kernel: LustreError: Skipped 155 previous similar messages
Mar 10 10:04:00 mds02 kernel: LustreError: 8923:0:(genops.c:556:class_register_device()) astro-OST0002-osc-MDT0000: already exists, won't add
Mar 10 10:04:00 mds02 kernel: LustreError: 8923:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.21.10.102 at o2ib: cfg command failed: rc = -17
Mar 10 10:04:00 mds02 kernel: Lustre:    cmd=cf001 0:astro-OST0002-osc-MDT0000  1:osp  2:astro-MDT0000-mdtlov_UUID  
Mar 10 10:04:00 mds02 kernel: LustreError: 15c-8: MGC10.21.10.102 at o2ib: The configuration from log 'astro-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar 10 10:04:00 mds02 kernel: LustreError: 7016:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server astro-MDT0000: -17
Mar 10 10:04:00 mds02 kernel: LustreError: 7016:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -17
Mar 10 10:04:00 mds02 kernel: Lustre: Failing over astro-MDT0000
Mar 10 10:04:01 mds02 kernel: Lustre: astro-MDT0000: Not available for connect from 10.21.208.26 at o2ib (stopping)
Mar 10 10:04:01 mds02 kernel: Lustre: Skipped 129 previous similar messages
Mar 10 10:04:15 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not available for connect from 172.20.2.101 at tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server.
Mar 10 10:04:15 mds02 kernel: LustreError: 137-5: astro-MDT0000_UUID: not available for connect from 172.20.2.101 at tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server.
Mar 10 10:04:15 mds02 kernel: LustreError: Skipped 35 previous similar messages
Mar 10 10:04:15 mds02 kernel: LustreError: Skipped 1 previous similar message
Mar 10 10:04:20 mds02 kernel: Lustre: server umount astro-MDT0000 complete
Mar 10 10:04:20 mds02 kernel: LustreError: 7016:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-17)
Mar 10 10:04:37 mds02 kernel: Lustre: MGS: Connection restored to  (at 10.21.207.58 at o2ib)
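
A note on the recurring -17 in the log above: kernel code reports failures as negated errno values, and on Linux errno 17 is EEXIST, which fits the "already exists, won't add" message from class_register_device(). Below is a minimal sketch for pulling such return codes out of a log line and naming them. The helper name decode_return_codes and the regular expression are illustrative assumptions, not part of Lustre or any Lustre tool, and the errno names come from the machine running the script.

```python
import errno
import re

# Hypothetical pattern: a negative integer that is not part of a larger
# token, so the "-5" in the message ID "137-5" is not treated as a code.
RC_PATTERN = re.compile(r"(?<![\w-])-(\d+)\b")


def decode_return_codes(line):
    """Return (code, errno name) pairs for negative return codes in a log line."""
    results = []
    for match in RC_PATTERN.finditer(line):
        code = int(match.group(1))
        # errno.errorcode maps positive errno numbers to their symbolic names.
        name = errno.errorcode.get(code, "unknown")
        results.append((-code, name))
    return results


line = ("LustreError: 8923:0:(obd_config.c:1835:class_config_llog_handler()) "
        "MGC10.21.10.102 at o2ib: cfg command failed: rc = -17")
print(decode_return_codes(line))
```

Run against the other failure lines (server_start_targets, server_fill_super, lustre_fill_super), this should report the same code, which suggests they all propagate the one error from the duplicate astro-OST0002-osc-MDT0000 registration.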


