[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming
Oleg Drokin
green at whamcloud.com
Sun Jan 19 12:46:12 PST 2025
On Sat, 2025-01-18 at 21:46 +0000, Day, Timothy wrote:
>
>
> > On 1/17/25, 10:17 PM, "Oleg Drokin" <green at whamcloud.com> wrote:
> > > On Sat, 2025-01-18 at 11:45 +1100, NeilBrown wrote:
> > > We need to demonstrate a process for, and commitment to, moving
> > > away
> > > from the dual-tree model. We need patches to those parts of
> > > Lustre
> > > that are upstream to land in upstream first (mostly).
> >
> >
> > I think this is not very realistic.
> > A large chunk (100%?) of users run neither the latest kernel release
> > nor the latest LTS.
> >
> >
> > When we were in staging last this manifested in random patches
> > being
> > landed and breaking the client completely and nobody noticing for
> > months.
> >
> >
> > Of course some automatic infrastructure could be built up to make
> > it
> > somewhat better, but it does not remove the problem of "nobody
> > would
> > run this mainline tree", I am afraid.
>
> I think there's a decent chunk of users on newer kernels. Ubuntu
> 22/24 is
> on (a bit past latest) LTS 6.8 kernel [1], AL2023 is on previous LTS
> 6.1 [2], and
> working on upcoming LTS 6.12 [3].
Well, I mostly mean in the context of Lustre client use. Sure, there's
some 6.8 LTS in use on those Ubuntu clients, though I cannot assess the
real numbers; the majority of reports I see are still on 5.x, even on
Ubuntu.
> When a patch lands in lustre-release/master, it could be around 1 -
> 1.5 years
> before it lands in a proper Lustre release. At that point, it might
> see real
> production usage.
Well, not really.
I guess it might not be seen as easily from the outside, but
"lustre-release/master" patches are backports from "true production"
branches. The number approaches 100% for features, but even a sizable
number of fixes are backports.
In particular, anything that comes from HPE is a backport: they run
their production stuff, sometimes hit problems, create fixes, and then
eventually determine that the problem is present in master as well (or
sometimes b2_x branches) and submit their ports there.
The actual lag between features being developed and then getting into
the master branch could be rather long too.
> So I think it's mostly a matter of convincing people to use an
> upstream
> client. I don't think people wanted to use the staging client because
> it
> didn't work well and wasn't stable. And vendors don't want to work on
> something that no one uses. If the client is "good enough" and people
> are confident it'll continue to be updated, I think they will use it.
> The
> staging client was neither of those things.
I agree that once you convince people (both users and developers) to use
the upstream client, things will move in this desirable direction, but
right now I don't know how to convince them.
On the RHEL (and derivatives) front in particular, the time lag is huge.
> > It does not help that there are what 3? 4? trees, not "dual-tree" by
> > any
> > stretch of imagination.
> >
> >
> > There's DDN/whamcloud (that's really two trees), there's HPE, LLNL
> > keeps their fork still I think (though it's mostly backports?).
> > There
> > are likely others I am less exposed to.
>
> I think most non-community Lustre releases are derived from the
> community release and periodically rebased. I think AWS,
> Whamcloud, LLNL, Microsoft would fall into that bucket. And I
> doubt DDN and HPE significantly diverge from community Lustre. But
> if someone is diverging significantly from community Lustre, I think
> they are opting into a significant maintenance burden regardless of
> what we do with lustre-release/master.
Both DDN and HPE significantly diverge with new features and such.
There's also a (now mostly dormant) Fujitsu "FEFS" fork that they got
tired of maintaining and tried to fold back in, but could not. (also
Cray's secure data appliance that seems to have met a similar fate:
https://github.com/Cray/lustre-sda )
Yes, the maintenance burden consideration is always there of course, so
there's some coordination nowadays (like reserving feature flags ahead
of time and such), but it's not outside the realm of possibility that if
what's perceived as "tip of the community tree" becomes inconvenient,
it'll be dropped.
In fact a similar thing happened to the staging lustre in the past I
guess, only before it even became the perceived tip (for a variety of
reasons).
> > Sure, only one of those trees is considered "community Lustre", but
> > if
> > it detaches too much from what the majority of developers really
> > run
> > and get paid to do - the "community Lustre" contributions probably
> > would diminish greatly, I am afraid.
>
> As long as the community Lustre development process is sane, I think
> most organizations will opt to continue deriving their releases from
> it and opt to continue contributing releases upstream. We just need
> to make sure we get buy-in from the people contributing to Lustre.
Well, there's another half of it, the kernel side. The previous run-in
with other kernel maintainers left a bit of a sour taste in people's
mouths.
Of course they have their own reasons to dictate whatever they want to
newcomers (and all incoming patches), but on the other hand Lustre is a
mature product that could not just drop everything and rewrite
significant chunks of the code (several times at that) to better align
with the ever-changing demands (bcachefs, I think, was a highly paraded
example of that, and they could accommodate those often conflicting
demands because there were not many deployments in the wild).
I don't know how possible it is to overcome this. Kernel maintainers
don't really care about Lustre (and rightfully so, we are but a blip to
them) and then we also have our own priorities.
And while for Lustre developers there's a benefit of "the adjusting to
new interfaces comes for free", there's no benefit to the kernel
maintainers, so they don't have much incentive.
(and again we saw this in the previous attempt)
And even imagine that by some magic the actual inclusion and all the
relevant rework happened. Now HPE or DDN wants to add a new feature;
they implement it, submit it, and are met with the usual "now rework
it in these other ways" demands.
Of course, again, from the kernel maintainers' perspective this is
entirely reasonable, and it's not their problem that the development
process is wrong and backwards: instead of developing everything in the
open on the public branch with input from all interested parties,
there's this closed development going on. But good luck convincing the
respective management of those companies to agree.