[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming
Oleg Drokin
green at whamcloud.com
Tue Feb 4 10:38:11 PST 2025
On Tue, 2025-02-04 at 17:33 +0000, Andreas Dilger wrote:
> You overlook that Tim works for AWS, so he would not actually pay to
> run these nodes. He could run in machine idle times while no external
> customer is paying for them.
If this could be arranged that would be great, of course, but I don't
want to assume something of this nature unless explicitly stated. And
who knows what sort of internal accounting might be in place to keep
track of (and approve) uses like this, too.
> I suspect with the random nature of the boilpot that it is the total
> number of hours runtime that matter, not whether they are contiguous
> or not. So running 24x boilpot nodes for 1h during off-peak times
> would likely produce the same result as 24h continuous on one node.
Well, not exactly. There need to be continuous chunks of at least 1x
the longest test run, and preferably much more (maybe 2x is a better
minimum?).
If conf-sanity takes 5 hours in this setup (CPU overcommit making
things slow and whatnot) and you only ever run for an hour, we never
get to try most of conf-sanity.
Also, compare 50 sessions each running conf-sanity once in parallel vs
10 sessions each running conf-sanity five times in parallel: the latter
probably wins coverage-wise, because over time the other conflicting
VMs would drift further apart, so the stress points in the code would
land more and more differently, I suspect. (We can probably test this
by running both setups in parallel on the same code for long enough and
seeing how much of a difference in crash rate it makes.)
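
To make the chunk-length point concrete, here is a minimal sketch. It
assumes a test always starts from the beginning of a session and makes
linear progress until the session is cut off; the function name and the
24-hour budget are illustrative only, and the 5-hour figure is the
conf-sanity example from above.

```python
def fraction_reached(test_hours, session_hours):
    """Fraction of one test that a single session gets through
    before being cut off (assumes linear progress from the start)."""
    if test_hours <= 0:
        raise ValueError("test_hours must be positive")
    return min(1.0, session_hours / test_hours)


# Same 24 node-hours of budget, split two ways.
CONF_SANITY_HOURS = 5  # example duration from the discussion above

# 24 nodes x 1 hour each: every session stops 1 hour into the test.
one_hour_slots = fraction_reached(CONF_SANITY_HOURS, 1)

# 1 node x 24 hours: the test completes, several times over.
full_day = fraction_reached(CONF_SANITY_HOURS, 24)
complete_passes = 24 // CONF_SANITY_HOURS

print(one_hour_slots, full_day, complete_passes)  # 0.2 1.0 4
```

Under these assumptions the 1-hour sessions never get past the first
20% of conf-sanity no matter how many nodes run them, while a single
24-hour session completes it four times.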
>
> Cheers, Andreas
>
> > On Feb 3, 2025, at 15:30, Oleg Drokin <green at whamcloud.com> wrote:
> >
> > On Mon, 2025-02-03 at 20:24 +0000, Oleg Drokin wrote:
> >
> > > at $11/hour the m7a.metal-48xl would take $264 to run for just one
> > > day, a week is an eye-watering $1848, so running this for every
> > > patch is not super economical I'd say.
> >
> > x2gd metal at $5.34 per hour makes more sense as it has more RAM
> > (and 64 CPUs is adequate I'd say) but still quite pricey if you
> > want to run this at any sort of scale.
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
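
The cost figures quoted above can be reproduced with a short sketch.
The hourly rates are the ones given in this thread ($11 and $5.34), not
current price-list values, and the helper name is hypothetical.

```python
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7


def run_cost(hourly_rate, hours):
    """Cost in dollars of running one node continuously for `hours`."""
    return round(hourly_rate * hours, 2)


# Hourly rates as quoted in the thread (assumed, not looked up).
rates = {"m7a.metal-48xl": 11.00, "x2gd metal": 5.34}

for name, rate in rates.items():
    day = run_cost(rate, HOURS_PER_DAY)
    week = run_cost(rate, HOURS_PER_DAY * DAYS_PER_WEEK)
    print(f"{name}: ${day:.2f}/day, ${week:.2f}/week")
```

This gives $264/day and $1848/week for the m7a.metal-48xl, matching the
numbers above, and about $128/day and $897/week for the x2gd metal
instance, per node.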