[lustre-discuss] OST is not mounting

James Lam unison2004 at hotmail.com
Tue Nov 7 08:42:50 PST 2023


If possible, run hexdump on the affected OST to see whether anything looks wrong:

https://groups.google.com/g/lustre-discuss-list/c/3cmmcKAB34w

If the OST is ldiskfs-backed, run e2fsck on it as the lowest-level ldiskfs check to see whether there are any problems. Remember to do a dry run first.
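
As a rough sketch of what a dry run could look like, the snippet below only builds and prints a read-only e2fsck command line. The device path /dev/sdX and the helper name e2fsck_dry_run_cmd are placeholders of mine, not details from this thread. It assumes the OST is unmounted and that the e2fsprogs build in use understands ldiskfs.

```python
import shlex
from typing import List


def e2fsck_dry_run_cmd(device: str) -> List[str]:
    """Build a read-only e2fsck invocation for an unmounted ldiskfs target."""
    # -f forces a check even if the filesystem is marked clean.
    # -n opens the filesystem read-only and answers "no" to every prompt,
    # so nothing on disk is modified.
    return ["e2fsck", "-f", "-n", device]


# /dev/sdX is a hypothetical device name; substitute the real OST device.
cmd = e2fsck_dry_run_cmd("/dev/sdX")
print(shlex.join(cmd))
```

If the dry run reports problems, the output can be reviewed before deciding whether to run a repairing pass.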

Regards,

James


________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Backer via lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Tuesday, November 7, 2023 2:19 PM
To: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] OST is not mounting

Hi,

Sending this again. Appreciate your help.

On Sun, 5 Nov 2023 at 11:11, Backer <backer.kolo at gmail.com> wrote:
Hi,

I am new to this email list. Looking to get some help on why an OST is not getting mounted.


The cluster was healthy until the OST experienced an issue and Linux remounted it read-only. After fixing the issue and rebooting the node multiple times, the OST still would not mount.

When the mount is attempted, the mount command errors out stating that the index is already in use. The index for the device is 33. This index is not mounted anywhere else.

The debug message from the MGS during the mount is attached at the end of this email. It asks to use writeconf. After using writeconf, the device mounted. I am looking for a couple of things here.

- I am hoping that the writeconf method is the right thing to do here.
- Why did the OST end up in this state after the write failure and the read-only remount? The write error was caused by the iSCSI target going offline and coming back a few seconds later.


20000000:01000000:17.0:1698240468.758487:0:91492:0:(mgs_handler.c:496:mgs_target_reg()) updating fs1-OST0021, index=33

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:4403:mgs_write_log_target()) Process entered

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:671:mgs_set_index()) Process entered

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:572:mgs_find_or_make_fsdb()) Process entered

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:551:mgs_find_or_make_fsdb_nolock()) Process entered

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:565:mgs_find_or_make_fsdb_nolock()) Process leaving (rc=0 : 0 : 0)

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:578:mgs_find_or_make_fsdb()) Process leaving (rc=0 : 0 : 0)

20000000:02020000:17.0:1698240468.758490:0:91492:0:(mgs_llog.c:711:mgs_set_index()) 140-5: Server fs1-OST0021 requested index 33, but that index is already in use. Use --writeconf to force

20000000:00000001:17.0:1698240468.772355:0:91492:0:(mgs_llog.c:712:mgs_set_index()) Process leaving via out_up (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)

20000000:00000001:17.0:1698240468.772356:0:91492:0:(mgs_llog.c:4408:mgs_write_log_target()) Process leaving (rc=18446744073709551518 : -98 : ffffffffffffff9e)

20000000:00020000:17.0:1698240468.772357:0:91492:0:(mgs_handler.c:503:mgs_target_reg()) Failed to write fs1-OST0021 log (-98)

20000000:00000001:17.0:1698240468.783747:0:91492:0:(mgs_handler.c:504:mgs_target_reg()) Process leaving via out (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
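
For anyone reading the log above, the rc fields print the same return code three ways: as an unsigned 64-bit value, as a signed value, and in hex. A minimal sketch of the conversion, assuming a Linux host for the errno lookup (the helper name to_signed64 is mine):

```python
import errno


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement value."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


# Unsigned value copied from the mgs_set_index() "Process leaving" line.
rc = to_signed64(18446744073709551518)
print(rc)  # -98

# Errno numbering is platform-specific; on Linux, 98 is expected to be EADDRINUSE.
print(errno.errorcode.get(-rc, "unknown"))
```

This matches the "index is already in use" message logged by mgs_set_index() just before the failure.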

