[lustre-discuss] OST is not mounting
Thomas Roth
t.roth at gsi.de
Mon Nov 13 04:34:28 PST 2023
So, did you run "writeconf", and did the OST mount afterwards?
As I understand it, the MGS was under the impression that this re-mounting
OST was actually a new one using an old index.
So, what made your repaired OST look new/different?
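
To make the quoted debug log below easier to read, here is a minimal Python sketch that decodes the two numbers in it. It assumes the usual convention that the last four characters of a target name are the index in hexadecimal, and that the large "rc" value is a negative errno printed as an unsigned 64-bit integer. The helper names are made up for this illustration; they are not part of any Lustre tool.

```python
import errno


def target_index(target_name: str) -> int:
    """Return the index encoded in a target name such as 'fs1-OST0021'."""
    # Assumption: the last four characters are the index in hexadecimal.
    return int(target_name[-4:], 16)


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    return value - (1 << 64) if value >= (1 << 63) else value


print(target_index("fs1-OST0021"))      # 33, matching "index=33" in the log
rc = to_signed64(18446744073709551518)
print(rc)                               # -98, matching "Failed to write ... log (-98)"
# The symbolic name depends on the platform's errno table; on Linux 98 is EADDRINUSE.
print(errno.errorcode.get(-rc, "unknown"))
```

So the MGS is refusing the registration with "address already in use" for index 33 (0x21), which fits the 140-5 console message in the log.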
I would probably have mounted it locally, as an ext4 file system, if
only to check that there is data still present (ok, "df" would do that,
too).
"tunefs.lustre --dryrun" will show other quantum numbers that _should
not_ change when taking down and remounting an OST.
And since "writeconf" has to be done on all targets, you have to take
down your MDS anyhow - so nothing is lost by simply trying an MDS restart?
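
For reference, a rough sketch of the order in which such a writeconf pass is usually done, as I understand it from the Lustre operations manual: stop everything, run "tunefs.lustre --writeconf" on the MDT first and then on every OST, then bring the MDT up before the OSTs. The Python below only builds the command strings and runs nothing. The device and mount point names are hypothetical, and it assumes a combined MGS/MDT and that all clients are already unmounted, so please check the manual for your version before doing this for real.

```python
def writeconf_plan(mdt, osts):
    """Return the shell commands, in order, for a writeconf pass.

    mdt and each entry of osts are (device, mountpoint) pairs.
    Nothing is executed; the caller decides what to do with the list.
    """
    steps = []
    # 1. Stop the servers: MDT first, then every OST.
    steps.append(f"umount {mdt[1]}")
    steps += [f"umount {mnt}" for _, mnt in osts]
    # 2. Regenerate the configuration logs: MDT first, then every OST.
    steps.append(f"tunefs.lustre --writeconf {mdt[0]}")
    steps += [f"tunefs.lustre --writeconf {dev}" for dev, _ in osts]
    # 3. Restart: (MGS/)MDT first, then the OSTs.
    steps.append(f"mount -t lustre {mdt[0]} {mdt[1]}")
    steps += [f"mount -t lustre {dev} {mnt}" for dev, mnt in osts]
    return steps


# Hypothetical devices, for illustration only.
for cmd in writeconf_plan(("/dev/mdt0", "/mnt/mdt0"),
                          [("/dev/ost33", "/mnt/ost33")]):
    print(cmd)
```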
Regards
Thomas
On 11/5/23 17:11, Backer via lustre-discuss wrote:
> Hi,
>
> I am new to this email list. Looking to get some help on why an OST is
> not getting mounted.
>
>
> The cluster was running healthy until the OST experienced an issue and
> Linux re-mounted the OST read-only. After fixing the issue and rebooting
> the node multiple times, the OST wouldn't mount.
>
> When the mount is attempted, the mount command errors out stating that
> the index is already in use. The index for the device is 33. This index
> is not mounted anywhere else.
>
> The debug message from the MGS during the mount is attached at the end
> of this email. It is asking to use writeconf. After using writeconf,
> the device was mounted. Looking for a couple of things here.
>
> - I am hoping that the writeconf method is the right thing to do here.
> - Why did the OST end up in this state after the write failure and
> being remounted RO? The write error was due to the iSCSI target going
> offline and coming back a few seconds later.
>
> 20000000:01000000:17.0:1698240468.758487:0:91492:0:(mgs_handler.c:496:mgs_target_reg()) updating fs1-OST0021, index=33
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:4403:mgs_write_log_target()) Process entered
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:671:mgs_set_index()) Process entered
> 20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:572:mgs_find_or_make_fsdb()) Process entered
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:551:mgs_find_or_make_fsdb_nolock()) Process entered
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:565:mgs_find_or_make_fsdb_nolock()) Process leaving (rc=0 : 0 : 0)
> 20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:578:mgs_find_or_make_fsdb()) Process leaving (rc=0 : 0 : 0)
> 20000000:02020000:17.0:1698240468.758490:0:91492:0:(mgs_llog.c:711:mgs_set_index()) 140-5: Server fs1-OST0021 requested index 33, but that index is already in use. Use --writeconf to force
> 20000000:00000001:17.0:1698240468.772355:0:91492:0:(mgs_llog.c:712:mgs_set_index()) Process leaving via out_up (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
> 20000000:00000001:17.0:1698240468.772356:0:91492:0:(mgs_llog.c:4408:mgs_write_log_target()) Process leaving (rc=18446744073709551518 : -98 : ffffffffffffff9e)
> 20000000:00020000:17.0:1698240468.772357:0:91492:0:(mgs_handler.c:503:mgs_target_reg()) Failed to write fs1-OST0021 log (-98)
> 20000000:00000001:17.0:1698240468.783747:0:91492:0:(mgs_handler.c:504:mgs_target_reg()) Process leaving via out (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org