[Lustre-discuss] tcp0 for maximum effect
Klaus Steden
klaus.steden at technicolor.com
Mon Jan 12 14:15:51 PST 2009
Hi folks,
Lustre doesn't do any link aggregation of its own; it simply uses the
network device the OS presents. If that device is a bonded NIC, Lustre
will use it with no problem, but the underlying bonding driver takes
care of load balancing and distribution.
I've used Lustre 1.6.x quite successfully with load-balanced 802.3ad
configurations; in some of my tests I was able to get about 350 MB/s
aggregate sustained across two OSS nodes with 2 x GigE bonded each.
802.3ad link aggregation is a standard NIC bonding protocol, and is
supported on all good quality L3 switches and by vendors like Cisco,
Foundry, Extreme, and Juniper.
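For reference, a minimal Linux bonding setup for 802.3ad (mode 4) on a
RHEL-style system of that era might look like the sketch below; the
interface names and address are placeholders, and the exact file paths
vary by distribution:

```
# /etc/modprobe.conf -- load the bonding driver in 802.3ad mode,
# polling link state every 100 ms
alias bond0 bonding
options bonding mode=802.3ad miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0 -- the virtual interface
DEVICE=bond0
IPADDR=192.168.0.19
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- enslave a physical NIC
# (repeat for eth1, changing only DEVICE)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

The switch ports the slaves plug into must also be configured as an
802.3ad (LACP) aggregation group, or the link will fall back to a
single active path.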
cheers,
Klaus
On 1/11/09 11:37 AM, "Peter Grandi" <pg_lus at lus.for.sabi.co.UK> etched on
stone tablets:
>> I have two boxes that have this:
>
>> [root at lustrethree Desktop]# ifconfig
>> eth0 Link encap:Ethernet HWaddr 00:1B:21:2A:17:76
>> inet addr:192.168.0.19 Bcast:192.168.0.255 Mask:255.255.255.0
>> RX bytes:120168321 (114.6 MiB) TX bytes:5300070662 (4.9 GiB)
>> [ ... ]
>> eth1 Link encap:Ethernet HWaddr 00:1B:21:2A:1C:DC
>> inet addr:192.168.0.20 Bcast:192.168.0.255 Mask:255.255.255.0
>> RX bytes:55673426 (53.0 MiB) TX bytes:846 (846.0 b)
>> [ ... another 4 like that, 192.168.0.21-24 ... ]
>
> That's a very bizarre network configuration, you have 5
> interfaces on the same subnet (presumably all plugged into the
> same switch) with no load balancing, as all the outgoing traffic
> goes via 'eth0'.
>
> You have some better alternatives:
>
> * Use bonding (if the switch supports it) to tie together the 5
> interfaces as one virtual interface with a single IP address.
>
> * Use something like 'nexthop' routing (and a couple other
> tricks) to split the load across the several interfaces. This
> is easier for the outgoing traffic than the incoming traffic,
> but it seems you have a lot more outgoing traffic.
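A hypothetical sketch of the 'nexthop' idea, using iproute2's
multipath route syntax; the gateway address is an example, and
splitting traffic this way only helps the outgoing direction, as
noted above:

```
# Replace the single default route with an ECMP route that spreads
# outgoing flows across two interfaces (run as root; adjust the
# gateway 192.168.0.1 and device names to taste)
ip route replace default scope global \
    nexthop via 192.168.0.1 dev eth0 weight 1 \
    nexthop via 192.168.0.1 dev eth1 weight 1
```

Incoming traffic is still steered by whatever addresses the peers
resolve, which is why this is the harder half of the problem.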
>
> * Use 1 10Gb/s card per server and a 1Gb/s switch with 2 10Gb/s
> ports. 10Gb/s cards and switches have fallen in price a lot
> recently (check Myri.com), and a server that can do several
> hundred MB/s really deserves a nice 10Gb/s interface.
>
> IIRC 'lnet' has something like bonding built in, but I am not
> sure that it handles multiple addresses in the same subnet well.
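The LNET feature alluded to here is its ability to bind one network to
several interfaces via the 'networks' module parameter; a minimal
sketch (check the Lustre manual for your version before relying on
it):

```
# /etc/modprobe.conf on a server -- present eth0 and eth1 to LNET
# as a single tcp0 network
options lnet networks="tcp0(eth0,eth1)"
```

This makes both NICs visible to LNET, but it is not transparent
load balancing in the 802.3ad sense.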
>
>> Would it be better to have these two boxes as OSS's or as MDT
>> or MGS machines? Currently they are configured 1 as a MGS and
>> the other as the MDT.
>
> If these are the two servers with gigantic disk arrays, I'd have
> on each both MDS and OSS. Possibly with the OSTs replicated
> across both machines in an active/passive configuration.
>
>> The question is does LNET use the available tcp0 connections
>> different from the OSS perspective as opposed to the MDT or
>> MGS perspective?
>
> Not sure what that question means.
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss