[lustre-devel] RFC: Spill device for Lustre OSD
Jinshan Xiong
jinshanx at google.com
Tue Nov 4 16:30:53 PST 2025
On Tue, Nov 4, 2025 at 4:25 PM Andreas Dilger <adilger at dilger.ca> wrote:
>
> On Nov 4, 2025, at 4:54 PM, Jinshan Xiong <jinshanx at google.com> wrote:
> > On Tue, Nov 4, 2025 at 3:48 PM Andreas Dilger <adilger at dilger.ca> wrote:
> >>
> >> If the overhead of a local Lustre mount on the OSS is problematic, that
> >> seems like something which could/should be fixed? The local mounts are
> >> already "non-recoverable" so that they do not get an entry in last_rcvd
> >> and their absence does not cause any recovery issues.
> >>
> >> The main issue we've seen with local mountpoints is that this can
> >> confuse HA and prevent Lustre module unloading if they are not taken
> >> into account during cleanup.
> >
> > You're right. That's actually why we didn't do it in the first place. If
> > an OSS crashes, it will definitely lead to recovery timeout and client
> > eviction.
>
> If an OSS crash with a local mountpoint is causing recovery timeouts,
> then that seems like a bug to be fixed. This was changed by Alex Z.
> in patch "LU-12722 target: disable recovery for local clients" back in
> commit v2_13_52-45-g8bd04b4e57.
>
Thanks for the context. My knowledge of Lustre is many years old ;-)
>
> Cheers, Andreas