[lustre-discuss] Bandwidth bottleneck at socket?

Brian Andrus toomuchit at gmail.com
Wed Aug 30 09:16:08 PDT 2017


All,

I've been doing various performance tests on a small Lustre 
filesystem, and there seems to be a consistent bottleneck of ~700MB/s per 
socket involved.

We have 6 servers with 2 Intel E5-2695 chips in each.

3 servers are clients, 1 is MGS and 2 are OSSes with 1 OST each. 
Everything is connected with 40Gb Ethernet.

When I write to a single stripe, the best throughput I see is about 
1.5GB/s. That doubles if I write to a file that has 2 stripes.
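For anyone wanting to reproduce the striping comparison: the stripe count is set per file (or per directory) with lfs setstripe. The paths below are just placeholders for whatever test directory you use on the mount.

```shell
# Create a 1-stripe file and a 2-stripe file in the test directory.
# -c sets the stripe count (number of OSTs the file is spread across).
lfs setstripe -c 1 /mnt/lustre/test/one_stripe.dat
lfs setstripe -c 2 /mnt/lustre/test/two_stripes.dat

# Confirm how each file is actually striped across the OSTs.
lfs getstripe /mnt/lustre/test/one_stripe.dat
lfs getstripe /mnt/lustre/test/two_stripes.dat
```

With only 2 OSSes of 1 OST each, -c 2 is the maximum useful stripe count here, which matches the observed doubling from ~1.5GB/s to ~3GB/s.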

If I do a parallel copy (using MPI-IO), I can get 1.5GB/s from a single 
machine, whether I use 28 cores or 2. If I use only 1 core, it drops 
to ~700MB/s.
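A quick way to check whether one writer process is the ceiling (independent of MPI-IO) is to compare aggregate bandwidth for 1 vs. 2 plain writer processes on the mount. This is a minimal sketch, not the test I ran; the sizes and file names are arbitrary, and you would point `directory` at the Lustre mount rather than a temp dir.

```python
import multiprocessing as mp
import os
import tempfile
import time

CHUNK = b"\0" * (1 << 20)  # 1 MiB write buffer


def write_file(path, mib):
    # Plain buffered writes followed by fsync, so timing includes
    # the data actually leaving the client cache.
    with open(path, "wb") as f:
        for _ in range(mib):
            f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())


def aggregate_bandwidth(nprocs, mib_per_proc, directory):
    # Spawn nprocs writers, each with its own file, and report
    # aggregate MiB/s. Comparing nprocs=1 vs nprocs=2 on the Lustre
    # mount shows whether a single process caps out around 700MB/s.
    paths = [os.path.join(directory, f"w{i}.dat") for i in range(nprocs)]
    procs = [mp.Process(target=write_file, args=(p, mib_per_proc))
             for p in paths]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    elapsed = time.perf_counter() - start
    return (nprocs * mib_per_proc) / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(f"1 proc : {aggregate_bandwidth(1, 32, d):8.1f} MiB/s")
        print(f"2 procs: {aggregate_bandwidth(2, 32, d):8.1f} MiB/s")
```

If 2 processes on one node roughly double the single-process number, the limit is per-process (or per-socket), not the network link.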

Is there a bandwidth bottleneck that can occur at the CPU-socket level 
on a system? It really seems that way.


Brian Andrus


