[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming

NeilBrown neilb at suse.de
Fri Jan 17 16:45:17 PST 2025


On Fri, 17 Jan 2025, Day, Timothy wrote:
> The following is a draft topic for the upcoming LSF/MM conference.
> I wanted to solicit feedback from the wider Lustre development
> community before submitting this to fsdevel. If I've omitted anything,
> something doesn't seem right, or you know of something that
> strengthens the argument, please let me know!
> 
> ----------------------------------------------------
> 
> Lustre is a high-performance parallel filesystem used for HPC and
> AI/ML compute clusters, available under GPLv2. Lustre has achieved
> widespread adoption in HPC and AI/ML and is commercially supported by
> numerous vendors and cloud service providers [1].
> 
> After 21 years and an ill-fated stint in staging, Lustre is still
> maintained as an out-of-tree module [6]. The previous upstreaming
> effort suffered from a lack of developer focus and user adoption,
> which eventually led to Lustre being removed from staging
> altogether [2].
> 
> However, the work to improve Lustre has not stopped. In the
> intervening years, the code improvements that would pave the way for
> a return to mainline have been steadily progressing. At least 25% of
> patches accepted for Lustre 2.16 were related to the upstreaming
> effort [3]. And all of the remaining work is in-flight [4][5]. Our
> eventual goal is to get a minimal TCP/IP-only Lustre client to an
> acceptable quality before submitting to mainline.

"Go big, or go home"!!

If our eventual goal is not "Get Lustre, both client and server, into
mainline Linux with support for TCP/IP and InfiniBand transports (at
least)", then we really shouldn't bother.

There is no formal, or even semi-formal, specification of the Lustre
protocol.  The Lustre protocol is "what the code does", so the client
and server cannot be developed separately the way they can be for,
e.g., NFS.

The goal you describe is an interim goal.  A first step (from the
upstream community perspective).

> 
> I propose to discuss:
> 
> - Expectations for a new filesystem to be accepted into mainline
> - Weaknesses in the previous upstreaming effort in staging
> 

I think we know at least one perspective on the weaknesses in the
previous upstreaming effort, and we need to demonstrate that we will
do better.

   https://lore.kernel.org/all/20180601091133.GA27521@kroah.com/

   There is a whole separate out-of-tree copy of this codebase where the
   developers work on it, and then random changes are thrown over the
   wall at staging at some later point in time.  This dual-tree
   development model has never worked, and the state of this codebase is
   proof of that.

We need to demonstrate a process for, and commitment to, moving away
from the dual-tree model.  We need patches to those parts of Lustre
that are upstream to land in upstream first (mostly).

That means we need the model for supporting older kernels to be based
entirely on libcfs holding the compatibility code, with no
kernel-version #ifdefs in the code itself.
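
As a rough sketch of that pattern (HAVE_TIMER_DELETE_SYNC is a
hypothetical configure-probe result, not a name from the actual Lustre
build system): the client code calls the current mainline API
unconditionally, and an out-of-tree libcfs header supplies a shim when
the probe finds the symbol missing.

   /* libcfs compat header -- exists only in the out-of-tree build.
    * HAVE_TIMER_DELETE_SYNC comes from a configure-time probe for the
    * symbol itself; nothing here compares LINUX_VERSION_CODE.
    */
   #include <linux/timer.h>

   #ifndef HAVE_TIMER_DELETE_SYNC
   /* timer_delete_sync() is the mainline name (since v6.2) for what
    * older kernels call del_timer_sync(); map the new name onto the
    * old one so callers never need to care. */
   static inline int timer_delete_sync(struct timer_list *timer)
   {
           return del_timer_sync(timer);
   }
   #endif

The client code then uses timer_delete_sync() everywhere: upstream it
compiles as-is, and builds against older kernels pick the shim up from
libcfs.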

We need a strong separation between server and client so that we can
justify everything that goes upstream as being to support the client,
and when we add server support to that, it just adds files.  Possibly we
could patch a few files to add server support, but we need to maintain
those as patches, not as alternate versions of upstream files.

We need to quickly reach a point where a lustre release is:

 - a verbatim copy of relevant files from a chosen upstream release,
   or just a dependency on that kernel source.
 - a bunch of extra files that might one day go upstream: server code
   and LNet protocol code
 - a *few* patches to integrate that code
 - some number of patches which have since gone upstream - bugfixes etc.
 - libcfs which contains a compat layer for older kernels.
 - user-space code, documentation, test scripts, etc. for which there
   is no expectation of upstreaming to the Linux kernel.
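
To make those bullet points concrete, a hypothetical release tree
might look something like this (every name is illustrative, not the
actual layout of lustre-release):

   lustre-release/
      upstream/      <- verbatim copy of the relevant files from the
                        chosen kernel release (or just a dependency)
      extra/         <- server code and LNet protocol code that might
                        one day go upstream
      patches/
         integrate/  <- the *few* patches that hook the extra code in
         backports/  <- fixes that have since gone upstream
      libcfs/        <- compat layer for older kernels
      tools/         <- user-space code, docs, and test scripts with
                        no expectation of upstreaming

Anything under upstream/ could then be verified with a straight diff
against the kernel release it claims to track.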

Maybe the question for LSF is: what is a sufficient demonstration of commitment?

The big question for us is: how are we going to transition our
infrastructure to this model?

It would be nice to have a timeline for getting the second and third
bullet points down to zero.  Obviously it would be aspirational at best,
but a list of steps could be useful.

Thanks,
NeilBrown

> 
> Lustre has already received a plethora of feedback in the past. While
> much of that has been addressed since then, the kernel is a moving
> target. Several filesystems have been merged (and removed) since
> Lustre left staging. We're aiming to avoid the mistakes of the past
> and hope to address as many concerns as possible before submitting
> for inclusion.
> 
> Thanks!
> 
> Timothy Day (Amazon Web Services - AWS)
> James Simmons (Oak Ridge National Labs - ORNL)
> 
> [1] Lustre Community Update: https://youtu.be/BE--ySVQb2M?si=YMHitJfcE4ASWQcE&t=960
> [2] Kicked out of staging: https://lwn.net/Articles/756565/
> [3] ORNL, Aeon, SUSE, AWS, and more: https://youtu.be/BE--ySVQb2M?si=YMHitJfcE4ASWQcE&t=960
> [4] LUG24 Upstreaming Update: https://www.depts.ttu.edu/hpcc/events/LUG24/slides/Day1/LUG_2024_Talk_02-Native_Linux_client_status.pdf
> [5] Lustre Jira Upstream Progress: https://jira.whamcloud.com/browse/LU-12511
> [6] Out-of-tree codebase: https://git.whamcloud.com/?p=fs/lustre-release.git;a=tree