[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming

Oleg Drokin green at whamcloud.com
Sat Feb 1 15:25:32 PST 2025


On Sun, 2025-02-02 at 09:19 +1100, NeilBrown wrote:
> On Sat, 25 Jan 2025, Oleg Drokin wrote:
> > On Sat, 2025-01-25 at 10:12 +1100, NeilBrown wrote:
> > > On Mon, 20 Jan 2025, Oleg Drokin wrote:
> > > > On Sun, 2025-01-19 at 09:48 +1100, NeilBrown wrote:
> > > > > 
> > > > > Once the transition completes there will still be process
> > > > > difficulties, but there are plenty of process difficulties
> > > > > now (gerrit: how do I hate thee, let me count the ways...)
> > > > > but people seem to simply include that in the cost of
> > > > > doing business.
> > > > 
> > > > it's been a while since I did patch reviews by email, but I
> > > > think gerrit is much more user-friendly (if you have
> > > > internet, anyway)
> > > 
> > > I guess it isn't exactly the gerrit interface but more the
> > > workflow that it encourages, or at least enables.
> > > The current workflow seems to be "patch at a time" rather than
> > > "patchset at a time".
> > > The fact that you cherry-pick patches into master is completely
> > > different to how most (all?) of the upstream community works.
> > > It means that the whole series isn't visible in the final git
> > > tree so we lose context.  And it seems to mean that a long
> > > series takes a loooooong time to land as it dribbles into
> > > master.
> > 
> > In fact the whole series is visible in gerrit if you submit it as
> > such.
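
(For reference, submitting it "as such" is just pushing the whole
local branch; gerrit then creates one change per commit and links
them into a relation chain. Roughly, assuming the stock gerrit ref
convention - the exact ref name may differ per setup:

    # push a local multi-patch series as one related chain of changes
    git push origin HEAD:refs/for/master
)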
> 
> True, but not as helpful as I might like.  I cannot see a way to
> add someone as a reviewer for a whole series, and there is no way
> for them to then give a +1 for the whole series.  These are
> trivial actions when using unstructured email.

But on the other hand it's a bit easier to assume they have seen all
the patches at least superficially, instead of rubberstamping the
whole series for whatever reason (I remember Linus complaining about
people just rubberstamping rebased series that obviously did not
work).

> > But as you noted later the testing is unfortunately much less
> > reliable than we want, and because we only land things in the
> > order they are submitted in the series - if you bunch unrelated
> > stuff all together suddenly it might take much longer to land.
> 
> Certainly only land them in the order they appear in the series,
> but my memory of my experience is that even when they are related
> they don't all land in master at once.  This might be due to the
> constant fight against false positives with testing, and the need
> to get people to re-review every patch when a small update is
> needed in an earlier patch.
> But it seems to be more than that.

Basically what could happen is you submit a long series of, say, 20
patches. Patch 7 fails testing but the first 6 patches are fine.
So the "list of patches ready to land" would only show the first 6
and they might get landed (sure, it's not the whole series, which
might be less than ideal, but since we ensure nothing individually
breaks things - not fatal?)

> > > I would MUCH rather that a whole series was accepted or
> > > rejected as a whole - and was merged rather than cherry-picked
> > > to keep commit ids stable.
> > 
> > Gerrit has such a mode, but we decided it does not work well for
> > us for a variety of reasons.
> 
> I wonder if it might be time to review those reasons, particularly
> as we need to review a lot of process if we are to move toward
> working upstream.
> We don't *have* to follow what other subsystems do (and they
> aren't all the same), but we would want to have clear reasons,
> understood by all, for deviating significantly.

One of the bigger problems I remember was merges. You merge a series
of patches, and the merge commit itself is not empty, but might
contain "invisible" unreviewed code.
The other is all those automatically added commit bits - where the
patch was reviewed, by whom, and so on - you cannot attach those to
a merge commit.
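
(To make those "commit bits" concrete: each landed patch gets
review/test trailers appended at landing time, something like the
following illustrative set - the exact names and trailers vary per
patch:

    Reviewed-on: https://review.whamcloud.com/...
    Tested-by: jenkins <...>
    Tested-by: Maloo <...>
    Reviewed-by: Some Reviewer <reviewer at example.com>

A merge commit covers many patches at once, so there is no sane
place to hang this per-patch metadata.)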
 
> > > There are times when I would like the first few patches of a
> > > series to land earlier, but that should be up to the submitter
> > > to split the series.
> > 
> > But you cannot if these later patches do depend on the earlier
> > ones?
> 
> Only because gerrit cannot cope.
> Using email, I might submit a set of 10 patches, get some discussion,
> decide that the first 5 really are good-to-go but the later ones need
> some work. So I resubmit the first 5.  They can easily be applied
> by the maintainer.

This already works perfectly and in fact even in unintended ways as you
noted earlier.
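
(On the email side that resubmission is just something like the
following - a hypothetical invocation, adjust the revision range to
taste:

    # re-send only the first 5 patches of a 10-patch series as v2
    git format-patch -v2 -5 HEAD~5
    git send-email v2-*.patch

and the maintainer applies them with git am as usual.)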

> With gerrit I could "abandon" the latter 5 but I might not want to do
> that - I might want to revise.  But probably abandoning would be ok.
> Still not quite as easy as simply not resubmitting them.

I think it's the other way around that requires resubmitting - when
you abandon the first 5 patches. (You don't have to resubmit the
latter 5 patches, but there needs to be some way of communicating to
the gatekeeper that it's ok that these patches have unmet
dependencies, because the automatic tooling does not have this
knowledge. And the tooling is not even in gerrit, it's homegrown
stuff; gerrit would happily let you land out-of-order patches even
when they totally break everything, as long as they cleanly merge.)

> > > And the automatic testing is a real pain.  Certainly it is
> > > valuable but it has a real cost too.  The false positives are
> > > a major pain.  I would rather any test that wasn't reliable
> > > were disabled (or fixed) as a priority.  Or at least made
> > > non-fatal.
> > > Also, it would be much nicer if the last in a series were
> > > tested first and if that failed then don't waste resources
> > > testing all the others.
> > > Bonus points for a "bisect" to find where the failure starts,
> > > but that can be up to the developer to explicitly request
> > > testing at some points in the series.
> > 
> > Yes, I agree there's much to be improved testing-wise. After
> > all, I did not come up with my own parallel testing system
> > because I was happy with the default one.
> > And then my own system deteriorated (at least it does not set
> > -1s left and right, though that means people totally ignore
> > those results too)
> > 
> 
> I think your parallel system that adds comments to gerrit is
> sometimes very helpful.  What bothers me is that you have another
> test setup that you run on master-next first and won't land things
> until that passes.

That's integration testing to make sure all the patches work nicely
together; that's why the whole -next thing exists in Linux as well.

Now I do have an additional testsuite for master-next - the boilpot.
I fully realize rejecting things late is super expensive for
everybody, and that's why I do all the automatic feedback I can on
individual patches super early on (the gerrit janitor and such).

But the boilpot is expensive. It takes days to weeks of intensive
load testing. While I'd love to subject every patch to this stress
testing, I only have one (now two) systems capable of it.
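
(To give a flavor of what that load is - a purely hypothetical
sketch, not the actual boilpot scripts, with the workload left as a
placeholder:

    #!/bin/sh
    # hypothetical boilpot-style loop: apply heavy concurrent load
    # to a test filesystem over and over until something breaks
    run_stress_load() {
        # stand-in for the real workload (racer-style file ops)
        sh lustre/tests/racer.sh
    }
    while :; do
        run_stress_load || break
    done
    # on failure: collect logs and crashdumps for triage

the real thing does much more around setup, monitoring and crash
collection.)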

> It means that I can pass all the auto testing and get all the
> reviews and still there is a question mark over if/when it will
> land.  For me that created a sense of helplessness that was quite
> demotivating.
> Obviously it is good to find bugs, and those bugs need to be
> fixed, but they don't need to prevent landing.

Landing known-buggy code sounds counter-productive. It increases the
number of failed tests, which makes new breakage harder to see, and
it hides further breakage that would happen later in the already
broken codepath but ends up not being reached due to an earlier
known problem.

The other problem (you also see it in spades with static code
analysis) is that once the buggy patch has landed, the motivation on
the part of the original developer suddenly wanes as they have other
things to move on to (not always the case, of course, but a very
real concern).
Sure, if there's no action for some time the patch could be
reverted, but that's even more expensive, esp. as there could be
other patches building on top, preventing clean reverts and such.

> I would feel more motivated if there were a clear process that was
> followed most of the time whereby once a series passed
> auto-testing and had sufficient reviews it would be expected to
> land e.g. the next Monday.  Maybe it doesn't land in master, maybe
> in something else that gets merged after a stabilisation window,
> but it should land.

That's master-next, right? Granted, we don't do specific
stabilization there and I do not have a good solution for
predictable boilpot runs.

If you want to avoid surprises for your patches, I can publish my
boilpot scripts and you can run your own instance if you have the
hardware.
Or we can find some sponsors for some sort of a shared public
instance where people could drop their patches?

> If asynchronous testing then reports a problem, that needs to be
> addressed.  Worst-case the patch might be reverted but that would
> be rare.  Often a small fix will make the problem go away.

The "small fix" better be reviewed in the full context though. Sure,
usually it's fine, but usually all patches are fine, the kind of
problems the boilpot finds are convoluted though.


