[Lustre-discuss] tcp network load balancing understanding lustre 1.8

Isaac Huang He.Huang at Sun.COM
Thu May 7 12:57:55 PDT 2009


On Thu, May 07, 2009 at 02:50:13PM +0200, Michael Ruepp wrote:
> Hi there,
> ......
> I give every NID an IP in the same subnet, e.g. 10.111.20.35-38 for
> oss0 and 10.111.20.39-42 for oss1.
> 
> Do I have to make modprobe.conf.local look like this to force Lustre
> to use all four interfaces in parallel:
> 
> options lnet networks=tcp0(eth0,eth1,eth2,eth3)
> Because on Page 138 the 1.8 Manual says:
> "Note: In the case of TCP-only clients, the first available
> non-loopback IP interface is used for tcp0 since the interfaces
> are not specified."

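Correct. Without an explicit interface list only the first available
non-loopback interface is used, so all four would have to be listed.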

> or do I have to specify it like this:
> options lnet networks=tcp
> Because on Page 112 the Lustre 1.6 Manual says:
> "Note: In the case of TCP-only clients, all available IP interfaces
> are used for tcp0

Wrong. The 1.6 manual needs to be updated as well. Sheila?

> ......
> My goal is to let Lustre utilize all four Gb links in parallel. My
> Lustre clients are equipped with two Gb links (eth0, eth1), which
> should be utilized by the Lustre clients as well.
> 
> Or is bonding the better solution in terms of performance?

I don't have any performance comparisons between the two approaches,
but I'd suggest going with Linux bonding instead (let's call the
tcp0(eth0,...ethN) approach Lustre bonding), because:
1. With Lustre bonding it's rather tricky to get routing right,
especially when all NICs reside in the same IP subnet. The Lustre TCP
network driver, as its name suggests, works at the TCP layer, and the
decision as to which outgoing interface to use is left to Linux IP
layer routing. When all NICs live in the same IP subnet, it's quite
possible that all outgoing packets would go through the interface of
the first route in the Linux routing table, unless some tweaking has
been done to also take source IPs into account. Incoming packets could
also arrive via unexpected NICs, depending on your settings in
/proc/sys/net/ipv4/conf/*/arp_ignore and your Ethernet topology.

2. Linux bonding does a good job of detecting link status via either
the ARP monitor or the MII monitor, but no such mechanism exists in
Lustre bonding.
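
To give an idea of the kind of tweaking meant in point 1, here is a
rough, untested sketch of source-based routing for oss0. The mapping
of addresses to eth0..eth3, the /24 netmask, and the table numbers
100-103 are assumptions for illustration only, and it only helps if
each connection is bound to the address of the interface it is meant
to use. The script prints the commands by default so they can be
reviewed first.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run as root with RUN="" in the environment to actually apply them.
RUN="${RUN-echo}"
SUBNET="10.111.20.0/24"   # assumed netmask; addresses are from the question

# setup_nic IFACE ADDR TABLE
# Give IFACE its own routing table and select it by source address, so
# traffic sent from ADDR leaves via IFACE rather than via the first route.
setup_nic() {
    $RUN ip route add "$SUBNET" dev "$1" src "$2" table "$3"
    $RUN ip rule add from "$2" table "$3"
}

# Assumed address-to-interface mapping for oss0
setup_nic eth0 10.111.20.35 100
setup_nic eth1 10.111.20.36 101
setup_nic eth2 10.111.20.37 102
setup_nic eth3 10.111.20.38 103

# Answer ARP only on the interface that owns the requested address, and
# prefer the matching local address as the source of ARP requests.
$RUN sysctl -w net.ipv4.conf.all.arp_ignore=1
$RUN sysctl -w net.ipv4.conf.all.arp_announce=2
```

Even with something like this in place, point 2 still applies.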

In fact, Lustre bonding is officially an obsolete feature, if I
remember correctly.
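
For the Linux bonding route, the module options could look roughly
like the fragment below. This is only a sketch: the device name bond0,
the bonding mode, and the miimon interval are assumptions to adapt to
your switch setup, and enslaving eth0..eth3 to bond0 is done in your
distribution's network configuration, which is not shown here.

```
# /etc/modprobe.conf.local (sketch, values are assumptions)
alias bond0 bonding
# MII link monitoring every 100 ms; 802.3ad needs switch support
options bonding mode=802.3ad miimon=100
# Lustre then sees a single interface
options lnet networks=tcp0(bond0)
```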

Thanks,
Isaac


