[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming

Oleg Drokin green at whamcloud.com
Sun Jan 19 21:37:28 PST 2025


On Mon, 2025-01-20 at 04:38 +0000, Day, Timothy wrote:

> I think every organization that uses Lustre has a model similar to
> this. But I don't think this is uncommon for other subsystems. The
> various OFED flavors come to mind (I think MOFED was mentioned
> in another thread). Everything is ultimately rebased on the
> community version, AFAIK.

My understanding of MOFED is that they are in exactly the same boat we
are hoping to avoid: "remove whatever pitiful stuff there is in the
Linux kernel and then plug in our own superior stuff".

> > Both DDN and HPE significantly diverge with new features and such.
> > There's also a (now mostly dormant) Fujitsu "FEFS" fork that they
> > got tired of maintaining and tried to fold back in, but could not.
> > (also Cray's secure data appliance that seems to have met a similar
> > fate: https://github.com/Cray/lustre-sda )
> > 
> > 
> > Yes, maintenance burden consideration is always there of course, so
> > there's some coordination nowadays (like reserving feature flags
> > ahead of time and such), but it's not outside the realm of
> > possibility that if what's perceived as "tip of the community tree"
> > becomes inconvenient, it'll be dropped.
> > In fact a similar thing happened to the staging lustre in the past
> > I guess, only before it even became the perceived tip (for a
> > variety of reasons).
> 
> Both DDN and HPE regularly contribute fixes/features back to the
> community branch from their respective production branches. HPE seems
> to rebase their branches fairly often on community Lustre [1]. You
> would have more context if that's true for DDN - I couldn't find much
> online.

Yes, merges and rebases are relatively common for as long as it
remains convenient. At your Cray link you might also notice the rebases
are not on top of master.

> But Fujitsu and the SDA team in HPE were not contributing back as
> much and eventually abandoned their forks. So based on those
> examples, it seems most sustainable for organizations to contribute
> to the community release. So I think the risk of contributions being
> lessened because Lustre moves towards upstream is low, IMHO.
> 
> But I agree with your fundamental point - we can't make submitting
> patches to community Lustre arduous.

I skipped the preceding parts, but this is probably going to be the
main point of contention.
The reason FEFS dropped out is that they did their development in
secret, without talking to anyone, making choices we (the "mainline"
Lustre people) found unwise or questionable.
So once Fujitsu came to us with "hey, we have this whole bunch of
awesome stuff", a lot of it had to be rejected because it was not done
in a good way or there was a competing implementation.

Now the tables are turning as I explained. We are doing development "in
secret" (as far as kernel maintainers are concerned, anyway).
> 
> [1] https://github.com/Cray/lustre
> 
> > > > Sure, only one of those trees is considered "community Lustre",
> > > > but if it detaches too much from what the majority of
> > > > developers really run and get paid to do - the "community
> > > > Lustre" contributions probably would diminish greatly, I am
> > > > afraid.
> > > 
> > > As long as the community Lustre development process is sane, I
> > > think most organizations will opt to continue deriving their
> > > releases from it and opt to continue contributing releases
> > > upstream. We just need to make sure we get buy-in from the people
> > > contributing to Lustre.
> > 
> > 
> > Well, there's another half of it, the kernel side. The previous
> > run-in with other kernel maintainers left a bit of a sour taste in
> > people's mouths.
> > Of course they have their own reasons to dictate whatever they want
> > to newcomers (and all incoming patches), but on the other hand
> > Lustre is a mature product that could not just drop everything and
> > rewrite significant chunks of the code (several times at that) to
> > better align with the ever-changing demands (bcachefs I think was a
> > highly paraded around example of that, and they could accommodate
> > those often conflicting demands because there were not many
> > deployments in the wild).
> > I don't know how possible it is to overcome. Kernel maintainers
> > don't really care about Lustre (and rightfully so, we are but a
> > blip to them) and then we also have our own priorities.
> 
> LSF/MM could be a good opportunity to improve our
> relationship with the upstream maintainers. :)

Absolutely. Though we did go there in the past, and had the discussions
and all, and there's no incentive for the kernel maintainers to accept
our way because obviously for them it's a bad process (and I don't
blame them!)

> > And while for Lustre developers there's a benefit of "the adjusting
> > to new interfaces comes for free", there's no benefit to the kernel
> > maintainers, so they don't have much incentive.
> > (and again we saw this in the previous attempt)
> > 
> > 
> > And even imagine by some magic the actual inclusion and all the
> > relevant rework happened. Now HPE or DDN wants to add a new
> > feature, they implement it and then submit it and are met with the
> > usual "now rework it in these other ways" demands.
> > Of course again from the kernel maintainers' perspective this is
> > entirely reasonable and it's not their problem the development
> > process is wrong and backwards and instead of developing everything
> > in the open on the public branch with input from all parties
> > interested there's this closed development going on. But good luck
> > convincing respective management of those companies to agree.
> 
> Backporting from production branches to the community release
> already takes some work. Especially if the feature is based on an
> older LTS. So I don't think porting to upstream Linux would be a huge
> amount of extra work.

Depends on how much extra friction the kernel acceptance adds (from
extra reviews by fs/mm maintainers) and I estimate it to be high
initially.
Don't forget all the "proprietary" features that are not in mainline,
but are fully developed otherwise. How many of those implementation
details would not be liked by the kernel maintainers is a big unknown.

> On the other hand, if Lustre was included in mainline properly rather
> than in staging - I think we’d have more leverage to implement things
> the way we want to. After all, the kernel maintainers don't really
> care about Lustre. :)

They care about the way interfaces are used; that was a pretty big
point of contention in the past and I'm sure it still remains.
Have you seen all the "nice" string matching/userspace memory parsing
we do for the jobid determination, for example? Yeah, I don't like it
either.
But there are other things they don't like too.
Definitely talk to hch, he has choice words about many parts of Lustre
(if he has not forgotten them yet).

I really want to be optimistic about it, but I also still remember the
previous attempt vividly, and the majority of the objections raised
back then are still pretty valid.



