[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming
Day, Timothy
timday at amazon.com
Wed Jan 22 09:17:12 PST 2025
> On 1/22/25, 6:14 AM, "Alexey Lyahkov" <alexey.lyashkov at gmail.com> wrote:
>
> Timothy,
>
> > On Jan 22, 2025, at 09:35, Day, Timothy <timday at amazon.com> wrote:
> >
> > I've created a second draft of the topic for LSF/MM. I tried
> > to include everyone's feedback. It's at the end of the email.
> >
> > Before that, I wanted to elaborate on Neil's idea about updating
> > our development model to an upstream-focused model. For upstreaming
> > to work, the normal development flow has to generate patches to mainline
> > Linux - while still supporting the distro kernels that most people use
> > to run Lustre. I think we can get to this point in stages. I've provided
> > a high-level overview in the next section. This won't be without
> > challenges - but the majority of the transition could happen without
> > interrupting feature work or normal development.
> >
>
> Can you explain how Lustre platform fragmentation will be avoided?
>
>
> I posted an example earlier.
> A distro locks in a Lustre version at release time. But Lustre servers have limited compatibility - in most cases only +/- 1…2 releases are guaranteed to connect. So a stale, aging client will live in the distribution kernel, and it won't work with modern servers.
> This happens easily, since a distribution's lifetime is ~8 years. So users will need to drop the in-kernel Lustre client and install a Lustre client from an external source, which is no different from the current state.
> The next step is a set of distributions carrying different Lustre versions that are not compatible with each other.
> Both of these increase support costs - once a large number of versions needs to be supported, development will slow down and all the time will be spent on support.
I think that's a reasonable concern. I spend a lot of time doing customer
support for Lustre; I definitely don't want to make that part of my job any
harder than it has to be.
In my personal experience, I've seen 2.10 and 2.15 interoperate well together.
That covers a gap of roughly 6 years at least. If someone stuck with RHEL7, the
first client they could use is 2.7.0 and the last client they could use is 2.16.0 [1].
So if a customer didn't update either their distro or filesystem, they could use an
up-to-date Lustre version for around 10 years covering 9 versions. So I think these
large version gaps are possible today.
There is an issue if distros don't want to update their clients. That's why we'll
still support running the latest Lustre on older distros. Specifically, it'll be the
Lustre code from a mainline kernel combined with our lustre_compat/ compatibility
code. So normal Lustre releases will be derived directly from the in-tree kernel
code. This gives vendors a path to deploy bug fixes and custom features, and
lets users optionally run the latest and greatest Lustre code.
[1] Lustre changelog: https://git.whamcloud.com/?p=fs/lustre-release.git;a=blob_plain;f=lustre/ChangeLog;hb=HEAD
> If this is not enough, here's one more. The kernel API isn't stable, so a large amount of resources will need to be spent handling each kernel change in Lustre. Currently, that work happens in the background and doesn't interrupt the primary work of supporting and developing new Lustre features.
>
> So those are the problems for the Lustre world - what are the benefits?
By upstreaming Lustre, we'll benefit from developers updating the kernel
API "for free". When Lustre was in staging/, there wasn't as much obligation
to keep Lustre in a working state. But if we get Lustre merged properly,
developers will not be able to merge changes that break Lustre. So we'll
get support for the latest and greatest kernels with less effort. That's one
of the main benefits of this effort.
We also benefit from having more say over the future of the kernel. A lot
of the difficulty with updating Lustre for new kernels comes when upstream
kernel developers restrict symbols or features to in-tree modules. This
could get even worse in the future, as things like symbol namespaces get
more use [2].
Even if most users use the out-of-tree backported-from-mainline-Linux
Lustre release, I think we'll still be in a stronger position after
upstreaming.
[2] https://lwn.net/Articles/760045/
>
> Alex
>
Tim Day