[Lustre-discuss] lnet infiniband config

Erik Froese erik.froese at gmail.com
Tue Jun 22 09:40:28 PDT 2010


Hey Thomas,

Are you trying to connect to Lustre via both IB and ethernet? If so, your
modprobe config should look like this:
options lnet networks="o2ib0(ib0),tcp0(eth0)"

If you're IB only, use:
options lnet networks="o2ib0(ib0)"

If your MDS and OSS servers are on a separate network from the clients
you'll need to do something different.
Let's say the MDS and OSSs are on o2ib0/tcp0 and the clients are on
o2ib1/tcp1. You'll need a router server with separate addresses on
o2ib0 and o2ib1.

Also, it's important to note that o2ib0 and o2ib1 should be in different IP
address spaces.

On the clients:
# I live on o2ib1
options lnet networks="o2ib1(ib0),tcp1(eth0)"
# To get to o2ib0 go through IP.ADD.OF.ROUTER@o2ib1
options lnet routes="o2ib0 IP.ADD.OF.ROUTER@o2ib1"

On the servers:
# I live on o2ib0
options lnet networks="o2ib0(ib0),tcp0(eth0)"
# To get to o2ib1 go through IP.ADD.OF.ROUTER@o2ib0
options lnet routes="o2ib1 IP.ADD.OF.ROUTER@o2ib0"

IP.ADD.OF.ROUTER@o2ib0 and IP.ADD.OF.ROUTER@o2ib1 are different IPs
on distinct networks.
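
For completeness, here is a rough sketch of what the router's own modprobe
config might look like. I'm assuming the router has two IB interfaces, with
ib0 on the server fabric and ib1 on the client fabric; the interface names
are placeholders, so adjust them to your hardware. The forwarding parameter
is what tells lnet to act as a router.

```
# On the router (interface names are assumptions)
# I live on both o2ib0 and o2ib1 and forward between them
options lnet networks="o2ib0(ib0),o2ib1(ib1)" forwarding="enabled"
```

The router itself doesn't need a routes entry for these two networks, since
it is directly attached to both.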

lctl list_nids will show you the Lustre NIDs of only the node you're
logged into.
lctl route_list will show you the Lustre routers and the networks that
they bridge.
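
If it helps, a quick check along these lines could confirm that a node
actually came up on an o2ib network rather than falling back to tcp. This is
only a sketch: has_o2ib_nid is a made-up helper name, not part of Lustre, and
it assumes NIDs are printed one per line in the usual address@network form.

```shell
#!/bin/sh
# has_o2ib_nid (hypothetical helper): reads NIDs on stdin, one per line,
# and succeeds if at least one of them is on an o2ib network.
has_o2ib_nid() {
    grep -Eq '@o2ib[0-9]*[[:space:]]*$'
}

# Only touch lctl if it is actually installed on this node.
if command -v lctl >/dev/null 2>&1; then
    if lctl list_nids | has_o2ib_nid; then
        echo "o2ib NID present"
    else
        echo "no o2ib NID found; lnet may still be using tcp"
    fi
    # Show the configured routers and the networks they bridge.
    lctl route_list
fi
```

If the only NID listed ends in @tcp, the networks option most likely isn't
being picked up by the lnet module.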

I hope this was helpful.

Erik

On Tue, Jun 22, 2010 at 10:19 AM, Thomas Roth <t.roth at gsi.de> wrote:
> Hi all,
>
> I'm getting my feet wet in the infiniband lake and of course I run into
> some problems.
> It would seem I got the compilation part of sles11 kernel 2.6.27 +
> Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the
> infiniband fabric, and because ko2iblnd loads without any complaints.
>
> In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of
> modprobe-configs), I have
>> options ip2nets="o2ib0 192.168.0.[1-5]"
> I load lnet and do 'lctl network up', but then 'lctl list_nids' will
> invariably give me only
>> 192.168.0.1 at tcp
> no matter how I twist the modprobe-config (ip2nets="o2ib",
> network="o2ib", network="o2ib(ib0), etc.)
>
> This is true as long as I have ib0 configured with the IP 192.168.0.1
> Once I unconfigure it, I get, quite expectedly,
> LNET configure error 100: Network is down
>
> So I can either configure ipoib and bring up the network, but using tcp,
> or I don't configure ib0 and then cannot start the network -? ;-{}  I
> think I'm rather missing something here.
> Any clues?
>
> Cheers,
> Thomas
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>


