[Lustre-discuss] Multi-Rail Configurations on a Multi-Port IB HCA

Dardo D Kleiner - CONTRACTOR dkleiner at cmf.nrl.navy.mil
Mon Nov 16 13:38:03 PST 2009


Stand down.  I don't know what was wrong with my configuration at first,
but LNet does instantiate both NIDs on a host with multiple ports
on a single HCA.  Unfortunately, I then get:

LustreError: 17771:0:(router.c:464:lnet_check_routes()) Routes to o2ib1 via xxx.xxx.182.193@o2ib4 and xxx.xxx.182.129@o2ib3 not supported

So I couldn't have done what I wanted to anyway; the answer to my
question below, "Should I be able to route over multiple lnets?", is
clearly no...
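
For anyone who wants to catch this before loading the modules, here is a
minimal sketch of the check as I read it from the error message: a remote
net whose gateways sit on more than one local net is rejected.  This is not
LNet's code.  The function names are made up for illustration, the parser
only handles the simple "net gateway[,gateway]" form used in this thread
(no hop counts or bracket expansion), and NIDs are written in the usual
address@net form.

```python
from collections import defaultdict


def parse_routes(routes):
    """Parse a simplified routes string into {remote_net: [(addr, local_net), ...]}.

    Hypothetical helper: entries are separated by ';', gateways by ',' or
    whitespace, and each gateway is a NID of the form address@net.
    """
    table = defaultdict(list)
    for entry in routes.split(";"):
        fields = entry.split(None, 1)
        if len(fields) < 2:
            continue  # skip empty or incomplete entries
        remote_net, gateways = fields
        for nid in gateways.replace(",", " ").split():
            addr, _, local_net = nid.partition("@")
            table[remote_net].append((addr, local_net))
    return dict(table)


def conflicting_routes(routes):
    """Return {remote_net: [local nets]} for remote nets reached via more than one local net."""
    conflicts = {}
    for remote_net, gateways in parse_routes(routes).items():
        local_nets = sorted({net for _, net in gateways})
        if len(local_nets) > 1:
            conflicts[remote_net] = local_nets
    return conflicts


if __name__ == "__main__":
    routes = "o2ib1 xxx.xxx.182.129@o2ib3,xxx.xxx.182.193@o2ib4"
    print(conflicting_routes(routes))
```

Run against the routes line from my modprobe file, this flags o2ib1 as
reached through both o2ib3 and o2ib4, which is the combination the
LustreError complains about.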

- Dardo

Dardo D Kleiner - CONTRACTOR wrote:
> Isaac Huang wrote:
>> On Fri, Nov 13, 2009 at 03:34:14PM -0500, Dardo D Kleiner - CONTRACTOR wrote:
>>> Mellanox ConnectX MT25418, two ports, each connected to a separate
>>> IB fabric - ib0 and ib1 have distinct IP subnets, each connected
>>> to a separate Lustre router.
>>> ......
>>> ip ad ls:
>>> 4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 4096
>>>      inet xxx.xxx.182.130/26 brd xxx.xxx.182.191 scope global ib0
>>> 5: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 4096
>>>      inet xxx.xxx.182.194/26 brd xxx.xxx.182.255 scope global ib1
>>>
>>> /etc/modprobe.d/lustre:
>>> options lnet \
>>>          ip2nets=" \
>>>                  o2ib1 xxx.xxx.[176-177].[0-255];
>>>                  o2ib3(ib0) xxx.xxx.182.[128-191];
>>>                  o2ib4(ib1) xxx.xxx.182.[192-255]"
>>>          routes=" \
>>>                  o2ib1 xxx.xxx.182.129@o2ib3,xxx.xxx.182.193@o2ib4"
>>>
>>> dmesg:
>>> .
>>> .
>>> Lustre: Listener bound to ib0:xxx.xxx.182.130:987:mlx4_0
>>> .
>>> .
>>>
>>>
>>> Why don't I also get "Listener bound to ib1:xxx.xxx.182.194:987:mlx4_0"?
>> What did 'lctl list_nids' show? It looked like only one NI was
>> initialized.
> 
> Only the one o2ib3 NID was listed; I did check that.  So it's your belief that
> I should have two distinct NIDs here?  Should I be able to route over multiple
> lnets?  On systems that have two HCAs I certainly do see multiple NIDs; this
> is the first system I've configured with one HCA that has two ports...
> 
> The filesystem wouldn't mount with this configuration, obviously.  One other bit
> of information is that it also wouldn't work if I only specified o2ib4(ib1),
> without the o2ib3(ib0) line (though now I realize I didn't try to set the
> ko2iblnd ipif_name to ib1 in that test).  It does work if I only have the
> o2ib3 lnet definition.
> 
> - Dardo
> 
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 
> 
> 


