[Lustre-discuss] Recommended 10 GigE cards for Lustre?

Scott Atchley atchley at myri.com
Tue Apr 8 05:55:11 PDT 2008


On Apr 8, 2008, at 5:48 AM, Daire Byrne wrote:
> Klaus,
>
> We use Myricom 10G cards and have found them to be reliable and
> good performers. They support the "MX" driver, which is a lower-
> latency kernel-bypass driver (which we don't use). The driver is in
> the kernel too, so there's no messing about when new kernels are
> released.
>
> I have no idea how it stacks up against the competition but it  
> works for us.
>
> Daire

Hi Daire,

Thanks for choosing Myricom. :-)

We do have an Ethernet driver in the kernel, but it lacks a few
features that the kernel maintainers declined to accept (e.g. LRO on
older kernels). Red Hat stays with the in-kernel driver, while SLES
ships our full-featured driver.

Klaus,

I am aware of several sites that use our NICs in Ethernet mode with
Lustre (via Lustre's SOCKLND). Many DOE labs use them this way, but
the site that has probably gotten the most publicity is Indiana
University. At SC06 and SC07, they ran Lustre over TCP across the
WAN, and they won the Bandwidth Challenge at SC07:

http://supercomputing.iu.edu/bandwidth_challenge.php

I have not tested Lustre with our NICs on the latest Intels, but I  
did some testing last year with Opterons and posted it at:

http://wiki.lustre.org/index.php?title=Myri-10G_Ethernet

Performance on the Intels should be better than on the Opterons given
their higher memory bandwidth; SOCKLND requires a data copy on
receive, so a 1 GB/s stream costs roughly 2 GB/s of memory bandwidth
(read plus write) for the copy alone. I know IU has seen much better
performance with Woodcrests (I think in the 1 GB/s range). They are
also using the latest DDN racks.

Given all of the above, the network is rarely the bottleneck; most
commonly the storage is. If you already have your storage, benchmark
it first to determine how much interconnect performance you actually
need, as in the sketch below.
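
For a rough first number, a timed streaming write will tell you what
the backend can sustain (lustre-iokit's sgpdd-survey and
obdfilter-survey are the proper tools for a real survey). A minimal
sketch in Python; the path and sizes are placeholders, and the total
written should comfortably exceed RAM so the page cache does not
flatter the result:

    import os, time

    PATH = "/mnt/target/bench.dat"  # placeholder: file on the storage under test
    CHUNK = 1 << 20                 # 1 MiB per write
    TOTAL = 8 << 30                 # 8 GiB total; make this larger than RAM

    buf = b"\0" * CHUNK
    start = time.time()
    with open(PATH, "wb") as f:
        written = 0
        while written < TOTAL:
            f.write(buf)
            written += CHUNK
        f.flush()
        os.fsync(f.fileno())        # include the flush to disk in the timing
    elapsed = time.time() - start
    print("%.0f MB/s" % (TOTAL / elapsed / 1e6))

Run it a few times and take the steady-state number; if that is well
below your NIC's line rate, a faster interconnect will not help.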

Also, if you have Woodcrest or later Intels, the choice of Linux
kernel will dramatically affect performance. On kernels 2.6.18 and
earlier, the kernel selects the wrong memory copy routine for these
processors, which kills socket throughput. Ideally, you would run
2.6.19 or later (I think SLES10 uses 2.6.20).
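
Because SOCKLND rides on ordinary TCP sockets, you can see this
kernel effect with a plain socket test between two nodes, no Lustre
involved. A minimal sketch in Python (the port and transfer size are
arbitrary choices); run the server on one node and the client on the
other, then compare kernels:

    # usage: python tcp_bench.py server
    #        python tcp_bench.py client <server_host>
    import socket, sys, time

    PORT = 5001              # arbitrary test port
    CHUNK = 1 << 20          # 1 MiB per send/recv
    TOTAL = 4 << 30          # move 4 GiB in total

    if sys.argv[1] == "server":
        srv = socket.socket()
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        got, start = 0, time.time()
        while got < TOTAL:
            # the receive side is where the kernel's copy routine matters
            data = conn.recv(CHUNK)
            if not data:
                break
            got += len(data)
        print("%.0f MB/s" % (got / (time.time() - start) / 1e6))
    else:
        s = socket.create_connection((sys.argv[2], PORT))
        buf = b"\0" * CHUNK
        sent = 0
        while sent < TOTAL:
            s.sendall(buf)
            sent += CHUNK
        s.close()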

And as Daire mentioned, you can run Lustre on our cards over Myrinet
Express (MX) using Lustre's MXLND and benefit from zero-copy bulk
transfers and low latency.
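
Switching between the two modes is just an LNET configuration change.
As an illustration only (the exact network types and interface names
vary; check the Lustre manual for your release), the modprobe.conf
entries look something like:

    # Ethernet mode via SOCKLND, using interface eth2 (illustrative):
    options lnet networks=tcp0(eth2)

    # MX mode via MXLND (interface name here is an assumption):
    options lnet networks=mx0(myri0)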

Scott



>
>
>
> ----- "Klaus Steden" <klaus.steden at thomson.net> wrote:
>
>> Hello everyone,
>>
>> Are there NICs that are "officially" recommended for use with CFS?
>> From what
>> I've read, and based on my experience with Infiniband, 10 GigE would
>> be kind
>> of a waste without support for RDMA.
>>
>> The cards I've looked at are a mixed bag for RDMA support, but
>> according to someone in another lab here, at least one card they've
>> worked with supports RDMA with the OpenFabrics InfiniBand/iSER
>> stack.
>>
>> Does anyone on the list have suggestions or first-hand experience (or
>> even
>> second-hand) with Lustre and 10 GigE?
>>
>> thanks in advance,
>> Klaus
>>



