[Lustre-discuss] Max bandwidth through a single 4xQDR IB link?
Bernd Schubert
bs_lists at aakef.fastmail.fm
Tue Jun 29 07:15:02 PDT 2010
Hello Ashley, hello Kevin,
I really see no point in using disks to benchmark network performance when
lnet_selftest exists. The benchmark order should be:
- test how much the disks can provide
- test network with lnet_selftest
=> make sure Lustre performance is not much below
min(disks, lnet_selftest)
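
A minimal sketch of that last check, in Python. The function name, the 0.85
tolerance and the example figures are assumptions for illustration, not
measurements from any real system:

```python
def check_lustre(measured_gb_s, disk_gb_s, lnet_gb_s, tolerance=0.85):
    """Compare measured Lustre throughput to min(disks, lnet_selftest).

    All rates are in GB/s. `tolerance` is a hypothetical threshold for
    "not much below"; pick whatever fits your site.
    """
    # The slower of the two components sets the ceiling for Lustre.
    ceiling = min(disk_gb_s, lnet_gb_s)
    bottleneck = "disks" if disk_gb_s <= lnet_gb_s else "network"
    fraction = measured_gb_s / ceiling
    return {
        "ceiling": ceiling,
        "bottleneck": bottleneck,
        "fraction": fraction,
        "ok": fraction >= tolerance,
    }


# Placeholder numbers only: 2.4 GB/s from the disks, 3.1 GB/s from
# lnet_selftest, 2.2 GB/s measured through Lustre.
result = check_lustre(2.2, 2.4, 3.1)
print(result)
```

If "ok" comes back False, the gap is in Lustre itself (or its tuning)
rather than in the disks or the network.
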
Cheers,
Bernd
On Tuesday, June 29, 2010, Kevin Van Maren wrote:
> DAPL is a high-performance interface that uses a small shim to provide a
> common DMA API on top of (in this case) the IB verbs layer. In general,
> there is a very small performance impact to be able to use the common
> API, so you will not get more large-message bandwidth using native IB
> verbs.
>
> I've never had enough disk bandwidth behind a node to saturate a QDR IB
> link, so I'm not sure how high LNET will go. If you have an IB test
> cluster, you should be able to measure the upper limits by creating an
> OST on a loopback device on tmpfs, although you have to ensure the
> client-side cache is not skewing your results (hint: boot the client with
> something like "mem=1g" to limit the RAM it can use for the cache).
>
> While the QDR IB link bandwidth is 4GB/s (or around 3.9GB/s with 2KB
> packets), the maximum HCA bandwidth is normally around 3.2GB/s
> (unidirectional), due to the PCIe overhead of breaking the transaction
> into (relatively) small packets and managing the packet flow
> control/credits. This is independent of the protocol, and limited by
> the PCIe Gen2 x8 interface. You will see somewhat higher bandwidth
> if your system supports and uses a 256-byte MaxPayload, rather than 128
> bytes. Use lspci to see what your system is using, as in:
>
>   lspci -vv -d 15b3: | grep MaxPayload
>
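
The PCIe arithmetic above can be sketched roughly as follows. This is a
back-of-the-envelope estimate, not a model of any particular HCA: the 24
bytes of per-packet overhead (header, sequence number, LCRC, framing) is an
assumption, and ACK/flow-control traffic is ignored, so the result is an
upper bound that sits above the ~3.2GB/s seen in practice.

```python
def pcie_payload_bandwidth(lanes=8, gt_per_s=5.0, max_payload=128,
                           tlp_overhead=24):
    """Upper-bound payload bandwidth (GB/s, one direction) of a PCIe link.

    Defaults describe PCIe Gen2 x8. `tlp_overhead` is an assumed number
    of non-payload bytes per packet; it is not read from any device.
    """
    # Gen2 uses 8b/10b encoding: 8 data bits for every 10 bits on the wire.
    raw_gbytes = lanes * gt_per_s * (8 / 10) / 8
    # Each packet carries at most max_payload bytes plus fixed overhead.
    efficiency = max_payload / (max_payload + tlp_overhead)
    return raw_gbytes * efficiency


for mps in (128, 256):
    bw = pcie_payload_bandwidth(max_payload=mps)
    print(f"MaxPayload {mps}: {bw:.2f} GB/s")
```

Under these assumptions the larger MaxPayload buys a few hundred MB/s,
which matches the "somewhat higher bandwidth" described above.
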
> Kevin
>
> Ashley Pittman wrote:
> > Hi,
> >
> > Could anyone confirm to me the maximum achievable bandwidth over a single
> > 4xQDR IB link into an OSS node? I have many clients doing a write test
> > over IB and want to know the maximum bandwidth we can expect to see for
> > each OSS node. For MPI over these links we see between 3 and 3.5GB/s,
> > but I suspect Lustre is capable of more than this because it's not using
> > DAPL. Is this correct?
> >
> > Ashley.
--
Bernd Schubert
DataDirect Networks