[Lustre-discuss] How to configure Redundant NICs over separate switches and subnets.

D. Marc Stearman marc at llnl.gov
Tue Jan 22 09:03:21 PST 2008


As far as I know, LNET will use the shortest path on the network, so
if you have two equivalent tcp networks, tcp0 and tcp1, LNET will
just use the first one. If it fails, it should use the second one.
If both NICs are in the same tcp network, LNET should use both.
Whether you decide on one or two LNET networks is up to you.
Regardless, your fstab entry is not correct. You should only list
one server as the host:

> 192.168.136.81@tcp0:/stage     /stage     lustre  defaults,_netdev 0 0

or

> 192.168.135.80@tcp1:/stage     /stage     lustre  defaults,_netdev 0 0

If one NIC fails while the client is not mounted, you would have to
change the fstab entry to point at the other NID before you can mount.
If Lustre is already mounted, it should just use the other LNET network.
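
If editing fstab by hand is a nuisance, a small wrapper can probe each
NID and mount through the first one that answers. This is only a sketch
and untested against a real NIC failure: it assumes "lctl ping" exits
non-zero when a NID is unreachable, and pick_mgs_nid and the LNET_PING
override are made-up names for illustration, not part of Lustre.

```shell
#!/bin/sh
# Sketch: print the first MGS NID that answers an LNET ping.
# LNET_PING is a hypothetical override hook (defaults to "lctl ping")
# so the selection logic can be exercised without Lustre installed.
pick_mgs_nid() {
    for nid in "$@"; do
        if ${LNET_PING:-lctl ping} "$nid" >/dev/null 2>&1; then
            printf '%s\n' "$nid"
            return 0
        fi
    done
    return 1    # no NID answered
}

# Example use (run as root on a client; not executed here):
#   nid=$(pick_mgs_nid 192.168.136.81@tcp0 192.168.135.80@tcp1) &&
#       mount -t lustre "$nid:/stage" /stage
```

The NIDs are tried in the order given, so list the preferred network
first. The fstab entry would then need "noauto" added so the wrapper,
not the boot-time mount, decides which NID to use.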

-Marc

----
D. Marc Stearman
LC Lustre Systems Administrator
marc at llnl.gov
925.423.9670
Pager: 1.888.203.0641


On Jan 21, 2008, at 2:24 PM, Lundgren, Andrew wrote:

> I am still unclear on how this should be configured.
>
> All of my clients, OSS and MGS servers have two nics, on different  
> subnets, connected to different switches. This is done for network  
> redundancy.  We are also using aliased IP addresses for lustre in  
> the 192.168 address space.  We are not bonding interfaces.
>
> In my /etc/modprobe.conf file on all of the machines I have the  
> following line:
>
> options lnet networks=tcp0(eth1:0),tcp1(eth0:0)
>
> When I format my OSTs, I have used:
>
> mkfs.lustre --fsname stage --ost --mgsnode=192.168.135.999@tcp0 --param="failover.mode=failout" /dev/md6
>
> Where 999 is the IP on the aliased nic.  (Should I use both tcp0  
> and tcp1 with a comma?)
>
> When I mount the clients in my /etc/fstab, I have done this:
>
> 192.168.136.81@tcp0,192.168.135.80@tcp1:/stage     /stage     lustre  defaults,_netdev 0 0
>
> My intent is that if one path to the MGS fails, the second one will  
> be used.
>
> Am I doing this correctly, or am I off base here?
>
> Thanks!
>
> --
> Andrew Lundgren
>


