[Lustre-discuss] Lustre with 10GbE or Infiniband?

Scott Atchley atchley at myri.com
Wed Feb 11 13:35:47 PST 2009


On Feb 11, 2009, at 2:25 PM, Brian J. Murrell wrote:

> On Wed, 2009-02-11 at 11:08 -0800, Jeffrey Bennett wrote:
>> Hi,
>>
>> Has anybody done any performance comparison between Lustre with  
>> 10GbE and Lustre with Infiniband 4X SDR? I wonder if they perform  
>> similarly.
>
> While I don't have any performance numbers or experience for you, I
> will mention the differences in the way Lustre uses those two
> technologies.
>
> On 10GbE, Lustre (via its sock LND) will use the TCP/IP stack on top
> of the Ethernet stack.  With Infiniband, we communicate directly with
> the I/B stack (via the o2ib LND) and take direct advantage of its RDMA
> capabilities to achieve a very high percentage of wire speed.
>
> My gut feeling is that the overhead of TCP/IP carves some percentage
> out of your ability to achieve full wire speed.
>
> Maybe some others here, including our benchmarking folks here at Sun,
> can provide some real-world experiences and comparisons.
>
> b.

Jeffrey,

To add to Brian's comments, IB 4X SDR is limited to about 700-750 MB/s
by the fabric. O2IBLND cannot go faster than the slower of the fabric
and the PCI-E connection.
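
As a rough sketch of that "slower of the two" reasoning, the snippet below
computes nominal one-direction data rates from lane count and signaling rate.
The PCI-E Gen1 x8 slot is an assumption chosen for illustration (the thread
does not say which slot is in use), and the function names are hypothetical.
These are theoretical ceilings; the 700-750 MB/s quoted above is an observed
figure and sits below the nominal 4X SDR data rate.

```python
# Nominal one-direction data rates in MB/s (1 MB = 10**6 bytes).
# Theoretical ceilings only, not measured Lustre throughput.

def encoded_link_mb_s(lanes, gbaud_per_lane, payload_bits=8, line_bits=10):
    """Data rate of a serial link after line-encoding overhead (8b/10b by default)."""
    data_gbit_s = lanes * gbaud_per_lane * payload_bits / line_bits
    return data_gbit_s * 1000 / 8  # Gb/s -> MB/s

# InfiniBand 4X SDR: 4 lanes at 2.5 Gbaud with 8b/10b encoding.
IB_4X_SDR = encoded_link_mb_s(lanes=4, gbaud_per_lane=2.5)

# ASSUMPTION: a PCI-E Gen1 x8 slot (8 lanes at 2.5 GT/s, 8b/10b encoding).
PCIE_GEN1_X8 = encoded_link_mb_s(lanes=8, gbaud_per_lane=2.5)

# 10GbE: 10 Gb/s data rate, before Ethernet/IP/TCP framing overhead.
TEN_GBE = 10 * 1000 / 8

def lnd_ceiling(fabric_mb_s, host_bus_mb_s):
    """An LND cannot move data faster than the slower of fabric and host bus."""
    return min(fabric_mb_s, host_bus_mb_s)

if __name__ == "__main__":
    print("IB 4X SDR nominal:", IB_4X_SDR, "MB/s")
    print("PCI-E Gen1 x8 nominal:", PCIE_GEN1_X8, "MB/s")
    print("10GbE nominal:", TEN_GBE, "MB/s")
    print("o2ib ceiling:", lnd_ceiling(IB_4X_SDR, PCIE_GEN1_X8), "MB/s")
```

With these assumed inputs the fabric, not the slot, is the limiting term; a
narrower or slower slot would flip that.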

SOCKLND is limited by a copy on the receive side. When a client
writes, the server has to copy the data out. When a client reads, the
client has to copy the data out. Because of this, from a server's
point of view, multiple-client read performance can scale with the
number of clients (the server is sending with zero-copy to multiple
clients) and can reach line rate. I did some tests a couple of years
ago with SOCKLND and our NICs:

http://wiki.lustre.org/index.php?title=Myri-10G_Ethernet

It shows a single server with 1 and 3 clients reading and writing.
When 3 clients read, they got very close to line rate.
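
The receive-side copy argument can be put into a toy model. This is only a
sketch: the function names are hypothetical, and the 500 MB/s copy rate is a
made-up illustrative number, not a measurement from the tests linked above.
It assumes each receiver's copy is the only per-host limit and that the
server's link is the only shared limit.

```python
def aggregate_read_mb_s(n_clients, per_client_copy_mb_s, server_line_rate_mb_s):
    """Reads: the server sends zero-copy, each client pays its own receive copy,
    so the aggregate grows with client count until the server link saturates."""
    return min(n_clients * per_client_copy_mb_s, server_line_rate_mb_s)

def aggregate_write_mb_s(n_clients, server_copy_mb_s, server_line_rate_mb_s):
    """Writes: the server pays the receive copy, so in this model adding
    clients does not raise the bound."""
    return min(server_copy_mb_s, server_line_rate_mb_s)

if __name__ == "__main__":
    LINE_RATE = 1250.0  # nominal 10GbE data rate in MB/s
    COPY_RATE = 500.0   # HYPOTHETICAL receive-copy limit per host, MB/s
    for n in (1, 3):
        print(n, "client(s): read bound",
              aggregate_read_mb_s(n, COPY_RATE, LINE_RATE),
              "MB/s, write bound",
              aggregate_write_mb_s(n, COPY_RATE, LINE_RATE), "MB/s")
```

In this model one client is copy-bound on reads, while three clients together
hit the server's line rate; the write bound stays at the server's copy rate.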

Indiana University won the SC07 Bandwidth Challenge using Lustre over
the wide area. They used SOCKLND with Myricom NICs and top-of-the-line
DDN storage. They saturated a 10 Gb/s link (sending and receiving  
simultaneously), but I think it took a couple of DDN systems and  
corresponding OSSes.

If your storage cannot exceed 700-750 MB/s, then either should work  
for you.

Scott
