[lustre-discuss] Lustre over 10 Gb Ethernet with and without RDMA

Ben Evans bevans at cray.com
Fri Jun 19 08:24:28 PDT 2015


It’s faster in that you eliminate all the TCP overhead and latency (something on the order of a 20% improvement in speed, IIRC; it’s been several years).

Balancing your network performance against what your disks can provide is a whole other level of system design and implementation.  You can stack enough disks or SSDs behind a server that the network becomes your bottleneck; you can pair enough network bandwidth with few enough disks that the drives become your bottleneck; you can stack up enough of both that the PCIe bus becomes your bottleneck.
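Ben's point can be sketched numerically: an OSS's sustained throughput is bounded by the slowest of its disk, network, and PCIe paths. A minimal sketch follows; all bandwidth figures are hypothetical illustrations, not measurements from any real system.

```python
# Illustrative bottleneck analysis for a single Lustre OSS.
# All bandwidth figures are hypothetical, chosen only to show the reasoning.
def oss_bottleneck(disk_mb_s, network_mb_s, pcie_mb_s):
    """Return the limiting component and its bandwidth in MB/s."""
    components = {"disks": disk_mb_s, "network": network_mb_s, "pcie": pcie_mb_s}
    name = min(components, key=components.get)
    return name, components[name]

# e.g. 12 HDDs at ~150 MB/s each, one 10 GbE link (~1200 MB/s),
# and a PCIe slot good for ~4000 MB/s:
name, bw = oss_bottleneck(disk_mb_s=12 * 150, network_mb_s=1200, pcie_mb_s=4000)
print(name, bw)  # network 1200 -- here the 10 GbE link is the limiter
```

With the example numbers, adding more disks changes nothing until the network is upgraded; that is the balancing act described above.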

Take the time to compare cost/performance against InfiniBand; since most systems have a dedicated client/server network, you might as well go as fast as you can.

-Ben Evans

From: igko50 at gmail.com [mailto:igko50 at gmail.com] On Behalf Of INKozin
Sent: Friday, June 19, 2015 11:10 AM
To: Ben Evans
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] Lustre over 10 Gb Ethernet with and without RDMA

Ben, is it possible to quantify "faster"?
Understandably, for a single client on an empty cluster it may feel "faster", but on a busy cluster with many reads and writes in flight I'd have thought the limiting factor would be the back end's throughput rather than the network, no? As long as the bandwidth to a client is somewhat higher than the average I/O bandwidth (the back end's throughput divided by the number of clients), the client should be content.
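The back-of-envelope rule in the paragraph above can be sketched as follows; the cluster sizes and bandwidths are hypothetical, picked only to make the arithmetic concrete.

```python
# Igor's rule of thumb: a client stays "content" as long as its network
# link is somewhat faster than its average share of back-end throughput.
def avg_per_client_mb_s(backend_mb_s, n_clients):
    """Average I/O bandwidth available per client, in MB/s."""
    return backend_mb_s / n_clients

# Hypothetical cluster: 20 GB/s aggregate back end shared by 200 clients,
# each attached via 10 GbE (~1200 MB/s usable).
share = avg_per_client_mb_s(backend_mb_s=20_000, n_clients=200)
link_mb_s = 1200
print(share, link_mb_s > share)  # 100.0 True -- the link is not the limiter
```

In this example each client's fair share is 100 MB/s, an order of magnitude below the link rate, so shaving network latency would not raise aggregate throughput.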

On 19 June 2015 at 14:46, Ben Evans <bevans at cray.com<mailto:bevans at cray.com>> wrote:
It is faster, but I don’t know what the price/performance tradeoff is, as I only used it as an engineer.

As an alternative, take a look at RoCE; it does much the same thing but uses normal (?) hardware.  It’s still pretty new, though, so you might hit some speed bumps.

-Ben Evans

From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org<mailto:lustre-discuss-bounces at lists.lustre.org>] On Behalf Of INKozin
Sent: Friday, June 19, 2015 5:43 AM
To: lustre-discuss at lists.lustre.org<mailto:lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] Lustre over 10 Gb Ethernet with and without RDMA

My question is about the performance advantages of Lustre RDMA over 10 Gb Ethernet. When using 10 Gb Ethernet to build Lustre, is it worth paying the premium for iWARP? I understand that iWARP essentially reduces latency, but I am less sure of its specific implications for storage. Would it improve performance on small files? Any pointers to representative benchmarks would be much appreciated.
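For context, the TCP-vs-RDMA choice in Lustre is made at the LNet layer: the socket driver (ksocklnd) carries the tcp network type, while the verbs driver (o2iblnd) carries o2ib and covers InfiniBand as well as iWARP and RoCE devices. A minimal sketch of the two configurations follows; the interface names are assumptions and must match your actual hardware.

```
# /etc/modprobe.d/lustre.conf -- pick ONE; interface names are examples.

# Plain TCP over 10 GbE (ksocklnd):
options lnet networks="tcp0(eth0)"

# RDMA via verbs (o2iblnd) over an iWARP/RoCE/InfiniBand device:
options lnet networks="o2ib0(eth0)"
```

The same NID naming (e.g. 192.168.1.10@tcp0 vs. 192.168.1.10@o2ib0) then appears in mount commands and server configuration.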

Chelsio released a white paper comparing Lustre RDMA over 40 Gb Ethernet against FDR IB, in which they claim comparable performance from both:
http://www.chelsio.com/wp-content/uploads/resources/Lustre-Over-iWARP-vs-IB-FDR.pdf
How much worse would the throughput be at small block sizes without iWARP?

Thank you
Igor

