[Lustre-discuss] Lnet configuration: 1 ost per gige interface.

Joe Georger jgeorger at ll.mit.edu
Wed Oct 22 05:00:23 PDT 2008

Even for bonding mode 6?


      balance-alb or 6 
          Adaptive load balancing: includes balance-tlb plus receive
          load balancing (rlb) for IPV4 traffic, and does not require
          any special switch support. The receive load balancing is
          achieved by ARP negotiation. 

    The bonding driver intercepts the ARP Replies sent by the local
    system on their way out and overwrites the source hardware address
    with the unique hardware address of one of the slaves in the bond
    such that different peers use different hardware addresses for the
    server.

    Receive traffic from connections created by the server is also
    balanced. When the local system sends an ARP Request the bonding
    driver copies and saves the peer's IP information from the ARP packet. 
    When the ARP Reply arrives from the peer, its hardware address is
    retrieved and the bonding driver initiates an ARP reply to this peer
    assigning it to one of the slaves in the bond. 
    A problematic outcome of using ARP negotiation for balancing is that
    each time that an ARP request is broadcast it uses the hardware
    address of the bond. Hence, peers learn the hardware address of the
    bond and the balancing of receive traffic collapses to the current
    slave. This is handled by sending updates (ARP Replies) to all the
    peers with their individually assigned hardware address such that
    the traffic is redistributed. Receive traffic is also redistributed
    when a new slave is added to the bond and when an inactive slave is
    re-activated. The receive load is distributed sequentially (round
    robin) among the group of highest speed slaves in the bond. 

    When a link is reconnected or a new slave joins the bond the receive
    traffic is redistributed among all active slaves in the bond by
    initiating ARP Replies with the selected mac address to each of the
    clients. The updelay parameter (detailed below) must be set to a
    value equal to or greater than the switch's forwarding delay so that
    the ARP Replies sent to the peers will not be blocked by the switch. 
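
To make the round-robin part of the description above concrete, here is a
rough Python sketch of how peers could be spread over the highest speed
slaves. It is only an illustration of the idea, not the bonding driver's
code; the function name, the data layout, and the MAC and IP addresses are
all made up for the example.

```python
from itertools import cycle


def assign_peers(peers, slaves):
    """Map each peer IP to a slave MAC, round robin over the fastest slaves.

    peers  -- list of peer IP strings (hypothetical values)
    slaves -- dict of slave name -> (mac, speed_mbps) (hypothetical layout)
    """
    if not slaves:
        return {}
    # Only the group of highest speed slaves takes part in receive balancing.
    top_speed = max(speed for _, speed in slaves.values())
    fastest = [mac for mac, speed in slaves.values() if speed == top_speed]
    # Hand out the fastest slaves' MACs sequentially, wrapping around.
    rotation = cycle(fastest)
    return {peer: next(rotation) for peer in peers}


# Example: two gigabit slaves and one 100 Mbit slave, four peers.
slaves = {
    "eth0": ("00:00:00:00:00:01", 1000),
    "eth1": ("00:00:00:00:00:02", 1000),
    "eth2": ("00:00:00:00:00:03", 100),
}
peers = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]
assignment = assign_peers(peers, slaves)
```

In this sketch each peer would then be sent an ARP Reply carrying its
assigned MAC, and calling assign_peers again after a slave is added or
re-activated stands in for the redistribution step.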

        * Prerequisites:
             1. Ethtool support in the base drivers for retrieving the
                speed of each slave.
             2. Base driver support for setting the hardware address of
                a device while it is open. This is required so that
                there will always be one slave in the team using the
                bond hardware address (the curr_active_slave) while
                having a unique hardware address for each slave in the
                bond. If the curr_active_slave fails its hardware
                address is swapped with the new curr_active_slave that
                was chosen.
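
The address swap in prerequisite 2 can be pictured with a small Python
sketch. Again this is just a model of the behaviour described, under my own
assumptions; fail_over and the addresses are hypothetical and do not come
from the driver.

```python
def fail_over(slave_macs, failed_active, new_active):
    """Swap MACs so the new active slave carries the bond hardware address.

    slave_macs    -- dict of slave name -> MAC currently set on that slave
    failed_active -- name of the curr_active_slave that failed
    new_active    -- name of the slave chosen to replace it
    Returns a new dict; the input is left unchanged.
    """
    macs = dict(slave_macs)
    # The failed slave held the bond MAC; exchange it with the new one so
    # every slave still has a unique address.
    macs[failed_active], macs[new_active] = macs[new_active], macs[failed_active]
    return macs


BOND_MAC = "00:00:00:00:00:01"  # made-up bond hardware address
before = {"eth0": BOND_MAC, "eth1": "00:00:00:00:00:02"}
after = fail_over(before, "eth0", "eth1")
```

Here eth1 ends up with the bond address and eth0 takes eth1's old one, which
is why the base driver has to allow changing the hardware address while the
device is open.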

Brian J. Murrell wrote:
> On Tue, 2008-10-21 at 12:15 -0500, Hendelman, Rob wrote:
>> I was under the impression that bonding nics required a managed switch to
>> support this.
> It does require a switch that supports link aggregation yes.  Sorry, I
> overlooked that you only had a dumb switch.
> b.
