[Lustre-discuss] Adding IB to tcp only cluster

Isaac Huang He.Huang at Sun.COM
Thu Oct 16 18:09:50 PDT 2008


On Sun, Oct 12, 2008 at 10:15:01AM -0400, Brock Palen wrote:
> ......
> Currently we don't put any lustre modules in modprobe.conf,  lustre  
> loads the correct modules when mounting the filesystem.  We do this  
> to keep our loads simple as we have several.

When nothing is specified, LNet loads ksocklnd (the TCP LND) by
default, which in turn uses the first usable interface returned by
SIOCGIFCONF.

> It would be nice if LNET picked IB without being told.  Similar to  
> the way OpenMPI has network weights of which to try first.

No such automatic mechanism exists. The LNet NIs can only be specified
statically via module options ('networks' or 'ip2nets').
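
For illustration only, a static setup along these lines in
modprobe.conf would give a node an NI on both networks. The interface
names (ib0, eth0) and the address patterns are assumptions, not taken
from your cluster, so adjust them to the actual nodes:

```
# /etc/modprobe.conf (sketch; ib0/eth0 are assumed interface names)
options lnet networks="o2ib0(ib0),tcp0(eth0)"

# Alternative: one shared file for several node types, matched by IP.
# Use either 'networks' or 'ip2nets', not both. Patterns are made up.
# options lnet ip2nets="o2ib0(ib0) 10.0.0.*; tcp0(eth0) 192.168.0.*"
```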

As to the choice of path for multi-homed LNet, the decision is based
solely on the destination NID. If the NID belongs to a local network
(e.g. 10.0.0.1@o2ib0 is on my local network @o2ib0 if I have an NI in
@o2ib0 too), traffic goes through the local NI on that network. If the
NID is on a remote network (e.g. 3@ptl0 if I don't have an NI in
@ptl0), a router is picked among the available routes based on the
load already queued on the routers, and the local NI to that router is
used for outgoing traffic (e.g. the NI in @tcp0 would be used if
192.168.0.1@tcp0 is the router chosen).
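
The selection rule above can be sketched roughly as follows. This is
a simplified Python model, not LNet code: the Route record, the
'queued' load figure and the function names are all hypothetical, and
the real router choice looks at more state than a single number.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


def net_of(nid: str) -> str:
    """Return the network part of a NID such as '10.0.0.1@o2ib0'."""
    _addr, _sep, net = nid.partition("@")
    return net


@dataclass
class Route:
    remote_net: str   # network reachable through this router
    gateway_nid: str  # router NID, which must be on one of our local networks
    queued: int       # load already queued on this router (hypothetical metric)


def choose_path(dest_nid: str, local_nets: Set[str],
                routes: Iterable[Route]) -> Tuple[str, Optional[str]]:
    """Return (local network to send on, gateway NID or None if direct)."""
    net = net_of(dest_nid)
    # Case 1: the destination is on a network where we have an NI.
    if net in local_nets:
        return net, None
    # Case 2: remote network, so pick the least-loaded usable router.
    candidates = [r for r in routes
                  if r.remote_net == net and net_of(r.gateway_nid) in local_nets]
    if not candidates:
        raise LookupError(f"no route to network {net}")
    best = min(candidates, key=lambda r: r.queued)
    # Outgoing traffic uses the local NI on the router's network.
    return net_of(best.gateway_nid), best.gateway_nid


local = {"tcp0", "o2ib0"}
routes = [Route("ptl0", "192.168.0.1@tcp0", queued=2),
          Route("ptl0", "10.0.0.9@o2ib0", queued=5)]
direct = choose_path("10.0.0.1@o2ib0", local, routes)
routed = choose_path("3@ptl0", local, routes)
```

Note that only the network part of the destination NID enters the
decision; nothing in the sketch compares one local network against
another.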

In other words, the LNet path from a multi-homed client to a
multi-homed server is determined by the server NID. For example, if
both the client and the server are on @tcp0 and @o2ib0, the client
would choose the IB network if the server NID is in @o2ib0, and the
TCP network otherwise. The server NID used by Lustre clients
presumably comes from the MGS, but I'm not sure about that. LNet has
no knowledge of whether a peer is multi-homed, so it cannot figure out
that the IB network is a better path to reach a peer whose NID is in
@tcp0.

Isaac


