[lustre-devel] [LSF/MM/BPF TOPIC] [DRAFT] Lustre client upstreaming

Day, Timothy timday at amazon.com
Mon Feb 3 12:10:53 PST 2025


> Unfortunately cloud is not very conducive to the way boilpot operates,
> the whole idea is to instantiate a gazillion of virtual machines that
> are run on a single physical host to overcommit the cpu (a lot!)
>
> so I have this 2T RAM AMD box and I instantiate 240 virtual machines on
> it, each gets 15G RAM and 15 CPU cores (this is the important part, if
> you do not have cpu overcommit, nothing works)

You can do a similar thing in the cloud with bare metal instances. Normally,
you can't do nested virtualization (i.e. QEMU/KVM inside EC2), but a bare
metal instance avoids that issue. That's how I run ktest [1], which uses
QEMU/KVM. Something like m7a.metal-48xl has 192 vCPUs and 768G of
memory, so it's similar to the size you mention. What ratio of overcommit
do you have? For RAM, it seems to be a bit under 2:1 (240 x 15G = 3600G
allocated against 2T physical). What about for CPU?
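
To make the overcommit arithmetic concrete, here is a rough Python sketch.
The VM count, per-VM RAM, and per-VM core count are the figures quoted
above. The helper name overcommit_ratio is made up for illustration, 2T is
assumed to mean 2048G, and since the core count of the AMD box isn't stated,
the CPU line uses the 192 vCPUs of m7a.metal-48xl purely as a hypothetical
host.

```python
def overcommit_ratio(vm_count, per_vm, host_total):
    """Ratio of resources promised to VMs versus what the host physically has."""
    return vm_count * per_vm / host_total


# Figures quoted in the thread: 240 VMs, each with 15G RAM and 15 cores.
VM_COUNT = 240
RAM_PER_VM_G = 15
CPUS_PER_VM = 15

# Assumption: "2T RAM" is taken as 2 * 1024 = 2048G.
HOST_RAM_G = 2 * 1024

# Assumption: the AMD box's core count is not given, so this uses the
# 192 vCPUs of an m7a.metal-48xl as a stand-in host.
HYPOTHETICAL_HOST_CPUS = 192

ram_ratio = overcommit_ratio(VM_COUNT, RAM_PER_VM_G, HOST_RAM_G)
cpu_ratio = overcommit_ratio(VM_COUNT, CPUS_PER_VM, HYPOTHETICAL_HOST_CPUS)

print(f"RAM overcommit: {ram_ratio:.2f}:1")
print(f"CPU overcommit on a 192-vCPU host: {cpu_ratio:.2f}:1")
```

Plugging in the real core count of the physical host would give the actual
CPU ratio, which is the number being asked about.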

Tim Day

[1] https://github.com/koverstreet/ktest/tree/master


