[Lustre-discuss] configuration question 1.6.4; multiple NICs on OSS

Klaus Steden klaus.steden at thomson.net
Mon Mar 3 11:52:45 PST 2008


Hi Jim,

I use bonding in one of our configurations here (LACP-based, to an Extreme
Summit series switch), and the overhead is not bad. My best performance
test so far showed about 340-350 MB/s of sustained read performance across
two OSS nodes, each with two GigE links aggregated using LACP, for a total
of four GigE links out of the file system.

Single link performance with the same equipment was about 200 MB/s (a single
NIC on each OSS), so for me, the overhead of LACP is worth it, since the
overall performance goes up significantly. With the right switch, you can
get some pretty impressive results using plain ol' vanilla GigE.

However ... that's just a suggestion.

From my experience, in order to do what I think you want to do ...

Each OSS would communicate on either eth0 or eth1, and thus its LNET config
would look like this in /etc/modprobe.conf:

options lnet networks="tcp0(eth0),tcp1(eth1)"

On the client side, in order to take advantage of the split networking, your
LNET config would look like this in /etc/modprobe.conf:

options lnet networks="tcp0(eth0)"

or this:

options lnet networks="tcp1(eth1)"

since, with what you're attempting, Lustre will push all of its traffic
over the first available link when multiple paths exist -- so if your
clients were left to choose between one or the other, you'd simply saturate
the tcp0 path while nothing much happened on the tcp1 path.
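
Since you asked about mount commands: the client mounts by the NID of the
MGS (usually co-located with the MDS in 1.6), so with the MDS on tcp0 a
client would run something like the following -- hostname and file system
name here are made up:

mount -t lustre mds1@tcp0:/testfs /mnt/lustre

One caveat to keep in mind: a client has to share a network with the MGS
NID it mounts from, so an MDS that only lives on tcp0 is one more argument
against splitting your clients between tcp0 and tcp1.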

This gets to be a bit of a hassle to manage, as the administrator has to
take a hand in the load balancing, deciding which clients use which LNET
network. It can be handled fairly trivially with some modulo arithmetic in
a Kickstart file (where you'd generate the LNET entries your client node
would use, as sketched below), but really ... it's extra work and extra
hassle.
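
Just to illustrate the Kickstart idea -- the hostname scheme here is made
up (client001, client002, ...) and the interface mapping is whatever you
chose above -- a %post section could assign the network like this:

# pick an LNET network from the trailing digits of the hostname
N=$(hostname -s | sed 's/[^0-9]//g')
N=${N:-0}   # fall back to 0 if the hostname has no digits
if [ $(( N % 2 )) -eq 0 ]; then
    echo 'options lnet networks="tcp0(eth0)"' >> /etc/modprobe.conf
else
    echo 'options lnet networks="tcp1(eth1)"' >> /etc/modprobe.conf
fi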

Using bonding on the OSSes, you get balanced usage of all the
participating NICs and respectable overall throughput, and you don't have
to fool around with multiple LNETs or IP subnetting.
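
For comparison, the bonded setup is compact. A minimal sketch of the
RHEL-style config (mode=4 is 802.3ad/LACP, so it needs a switch configured
for it; miimon is the link-monitor interval in ms) would be, in
/etc/modprobe.conf:

alias bond0 bonding
options bond0 mode=4 miimon=100
options lnet networks="tcp0(bond0)"

plus the usual ifcfg-bond0/ifcfg-eth0/ifcfg-eth1 scripts to enslave the
two NICs -- and LNET only ever sees the one interface.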

That's just my two cents, and I'm happy to be proven wrong, but for my
money (and labour), it was easier to implement Lustre on top of a solid
NIC bonding framework than to try to split up multiple LNETs and keep it
all sorted in my head and on paper.

cheers,
Klaus

On 3/3/08 11:37 AM, "Jim Albin" <jim_albin at nrel.gov> did etch on stone
tablets:

> Hello,
>   We're trying to see if we can use multiple NICs on a pair of OSSes
> without bonding. I'm trying to decipher the Multi-Home example in the
> Operations Manual 1.6_v1.10, Chapter 7, and I must be missing something.
> I have not attempted bonding yet; the manual seems to suggest you can
> use multiple NICs without bonding and avoid its overhead. We're looking
> for either failover or load balancing advantages over a single NIC in
> the OSS.
> 
>  Could someone please post an example of a configuration similar to
> this:
> 
> mdt - eth0 only
> oss1,oss2 - eth0 & eth1
> client configuration
> 
> If you could include the modprobe.conf entry, mount commands and
> anything else to try or verify with I'd appreciate it very much.
> Thanks in advance.
