[lustre-discuss] Changing default recovery window time settings

Christian Kuntz c.kuntz at opendrives.com
Thu Aug 4 21:36:43 PDT 2022


Hello all,

I'm wondering whether there is any way to tune the maximum length of the
recovery window that Lustre uses when imperative recovery fails because the
MGS has failed over. On MGS failover, we appear to hit a default timeout of
around 6 minutes that seems to be unavoidable. Our cluster has fewer than 10
nodes in total, so it seems that this timeout could safely be made much
shorter.

I understand that I'm approaching an unsafe/risky situation and asking for
it to be made more unsafe, but we'd like startup after a total cluster
failure to be as fast as possible (within reason, of course).
Alternatively, any way to manually end the recovery window would be
appreciated.

Cheers, and thanks for your attention,
Christian Kuntz