[lustre-discuss] OST is not mounting

Thomas Roth t.roth at gsi.de
Mon Nov 13 04:34:28 PST 2023


So, did you do the "writeconf"? And the OST mounted afterwards?

As I understand it, the MGS was under the impression that the OST being 
remounted was actually a new one claiming an index that is already in use.
So, what made your repaired OST look new/different?
I would probably have mounted it locally, as an ext4 file system, if 
only to check that there is data still present (ok, "df" would do that, 
too).
"tunefs.lustre --dryrun" will show the other "quantum numbers" (target 
name, index, flags) that _should not_ change when taking down and 
remounting an OST.

And since "writeconf" has to be done on all targets, you have to take 
down your MDS anyhow - so nothing is lost by simply trying an MDS restart?

Regards
Thomas

On 11/5/23 17:11, Backer via lustre-discuss wrote:
> Hi,
> 
> I am new to this email list. Looking to get some help on why an OST is 
> not getting mounted.
> 
> 
> The cluster was healthy until the OST experienced an issue and 
> Linux remounted it read-only. After fixing the issue and rebooting 
> the node multiple times, the OST still wouldn't mount.
> 
> When the mount is attempted, the mount command errors out, stating that 
> the index is already in use. The index for the device is 33. There is 
> no place where this index is mounted.
> 
> The debug message from the MGS during the mount is attached at the end 
> of this email. It asks to use writeconf. After using writeconf, 
> the device was mounted. Looking for a couple of things here.
> 
> - I am hoping that the writeconf method is the right thing to do here.
> - Why did the OST get into this state after the write failure and the 
> read-only remount? The write error was due to the iSCSI target going 
> offline and coming back a few seconds later.
> 
> 20000000:01000000:17.0:1698240468.758487:0:91492:0:(mgs_handler.c:496:mgs_target_reg()) updating fs1-OST0021, index=33
> 
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:4403:mgs_write_log_target()) Process entered
> 
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:671:mgs_set_index()) Process entered
> 
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:572:mgs_find_or_make_fsdb()) Process entered
> 
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:551:mgs_find_or_make_fsdb_nolock()) Process entered
> 
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:565:mgs_find_or_make_fsdb_nolock()) Process leaving (rc=0 : 0 : 0)
> 
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:578:mgs_find_or_make_fsdb()) Process leaving (rc=0 : 0 : 0)
> 
> 20000000:02020000:17.0:1698240468.758490:0:91492:0:(mgs_llog.c:711:mgs_set_index()) 140-5: Server fs1-OST0021 requested index 33, but that index is already in use. Use --writeconf to force
> 
> 20000000:00000001:17.0:1698240468.772355:0:91492:0:(mgs_llog.c:712:mgs_set_index()) Process leaving via out_up (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
> 
> 20000000:00000001:17.0:1698240468.772356:0:91492:0:(mgs_llog.c:4408:mgs_write_log_target()) Process leaving (rc=18446744073709551518 : -98 : ffffffffffffff9e)
> 
> 20000000:00020000:17.0:1698240468.772357:0:91492:0:(mgs_handler.c:503:mgs_target_reg()) Failed to write fs1-OST0021 log (-98)
> 
> 20000000:00000001:17.0:1698240468.783747:0:91492:0:(mgs_handler.c:504:mgs_target_reg()) Process leaving via out (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
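
A side note on reading the log above: the three rc values on each "Process leaving" line appear to be the same return code printed as unsigned 64-bit, signed, and hex. Below is a minimal Python sketch of that conversion, assuming the line format is exactly as quoted. The helper name to_signed64 and the regular expression are my own illustration, not part of Lustre.

```python
import re


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


# Hypothetical pattern for the "(rc=<unsigned> : <signed> : <hex>)" triple
# as it appears in the quoted log; the "0x" prefix is optional there.
RC_PATTERN = re.compile(r"\(rc=(\d+) : (-?\d+) : (?:0x)?([0-9a-fA-F]+)\)")

# One line copied from the quoted MGS debug log.
line = (
    "20000000:00000001:17.0:1698240468.772355:0:91492:0:"
    "(mgs_llog.c:712:mgs_set_index()) Process leaving via out_up "
    "(rc=18446744073709551518 : -98 : 0xffffffffffffff9e)"
)

match = RC_PATTERN.search(line)
unsigned_rc = int(match.group(1))   # first field, unsigned 64-bit
logged_signed = int(match.group(2)) # second field, as printed in the log
hex_rc = int(match.group(3), 16)    # third field, hex of the same bits

signed_rc = to_signed64(unsigned_rc)
print(signed_rc)  # -98
```

On Linux, errno 98 is EADDRINUSE, so -98 matches the "index is already in use" message from mgs_set_index() a few lines earlier.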

