[Lustre-discuss] Multihomed question: want Lustre over IB and Ethernet
Charles Taylor
taylor at hpc.ufl.edu
Fri Mar 7 02:40:23 PST 2008
Run "lctl list_nids" on your MDS and OSSs. The output should look
something like this:
[root@hpcmds ~]# lctl list_nids
10.13.24.40@o2ib
10.13.16.40@tcp
Each client should then have a NID on one network or the other (o2ib or tcp).
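
As a rough illustration of that check, the sketch below extracts the network part of each NID and tests whether a client shares a network with the server. The function names and the client address 10.13.16.77 are made up for the example. It assumes one NID per line, in the form printed by "lctl list_nids", and it treats a network name without a number (tcp) as network 0 (tcp0), which is how LNET interprets it.

```shell
#!/bin/sh
# Print the network part of each NID read from stdin,
# e.g. "10.13.24.40@o2ib" -> "o2ib". Expects one NID per line.
nid_networks() {
    sed -n 's/^[[:space:]]*[^@[:space:]]*@\([A-Za-z0-9]*\).*$/\1/p'
}

# LNET treats "tcp" and "tcp0" as the same network, so append the
# default network number 0 when none is given.
normalise_net() {
    case "$1" in
        *[0-9]) printf '%s\n' "$1" ;;
        *)      printf '%s\n' "${1}0" ;;
    esac
}

# Return 0 if the server and client NID lists (newline-separated
# strings) have at least one network in common, 1 otherwise.
shares_network() {
    server_nids=$1
    client_nids=$2
    for s in $(printf '%s\n' "$server_nids" | nid_networks); do
        for c in $(printf '%s\n' "$client_nids" | nid_networks); do
            if [ "$(normalise_net "$s")" = "$(normalise_net "$c")" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Example with the server NIDs shown above and a hypothetical GigE client:
#   shares_network "$(lctl list_nids)" "10.13.16.77@tcp" && echo "common network found"
```

This only compares network names; it does not prove the client can actually reach the server over that network.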
Check your dmesg output after loading lnet. The complaints are
pretty useful. Your modprobe.conf line looks correct, although we
found we did not need all the quoting, so you should check that as
well. Ours looks like this:
options lnet networks=o2ib(ib0),tcp(eth0)
My guess is that it either cannot find or does not like your ko2iblnd
module.
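
One way to follow up on that guess is to work out which LND modules a given "networks=" value should pull in and then check that each one is installed. The helper below is only a sketch: the function name is made up, it maps just the two network types in this thread (o2ib to ko2iblnd, tcp to ksocklnd), and it assumes a single interface per network, as in the lines above.

```shell
#!/bin/sh
# Print the LND kernel module expected for each entry of an lnet
# "networks=" value, one module name per line.
# Assumes one interface per network, e.g. "o2ib(ib0),tcp(eth0)";
# an entry such as "tcp(eth0,eth1)" would be split incorrectly.
expected_lnds() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        net=${entry%%"("*}        # drop the "(ib0)" interface part
        case "$net" in
            o2ib*) echo ko2iblnd ;;
            tcp*)  echo ksocklnd ;;
            *)     echo "unknown:$net" ;;
        esac
    done
}

# Example use on a server (modinfo fails if the module is not installed):
#   for m in $(expected_lnds 'o2ib(ib0),tcp(eth0)'); do
#       modinfo "$m" >/dev/null 2>&1 || echo "$m: module not found"
#   done
#   dmesg | grep -i lnet
```

If modinfo cannot find ko2iblnd, the o2ib network cannot come up, whatever the modprobe.conf line says.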
ct
On Mar 7, 2008, at 12:46 AM, Chris Worley wrote:
> Most everything is over IB, but I have a few systems I'd like to mount
> the Lustre fs over GigE.
>
> I think I've followed the Multihomed instructions correctly, in:
>
> http://dlc.sun.com/pdf/820-3681/820-3681.pdf
>
> My /etc/modprobe.conf on mds/mgs/oss servers (which all have both
> Ethernet and IB) includes:
>
> options lnet 'networks="tcp0(eth0),o2ib0(ib0)"'
>
> I make and mount the mdt with (which has both IB and Ethernet, subnet
> 36.122.x.x is IB, 36.121.x.x is Ethernet):
>
> # mkfs.lustre --mdt --mgs
> --mgsnode="36.122.255.201@o2ib0,36.121.255.201@tcp0" <... > /dev/md0
> # mount -t lustre /dev/md0 /lfs/mdtb
>
> But, at this point, the ksocklnd module is loaded rather than the
> ko2iblnd module!
>
> On the OSS, I make the fs w/ the same "mgsnode", but, when I try to
> mount it, it correctly uses the IB interface, but can't contact the
> MDS:
>
> LustreError: 27520:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
> for MGC36.122.255.201@o2ib_0
> LustreError: 27520:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
> find peer MGC36.122.255.201@o2ib_0!
> LustreError: 27520:0:(ldlm_lib.c:312:client_obd_setup()) can't add
> initial connection
> LustreError: 17126:0:(connection.c:142:ptlrpc_put_connection())
> NULL connection
> LustreError: 27520:0:(obd_config.c:325:class_setup()) setup
> MGC36.122.255.201@o2ib failed (-2)
> LustreError: 27520:0:(obd_mount.c:454:lustre_start_simple())
> MGC36.122.255.201@o2ib setup error -2
> LustreError: 27520:0:(obd_mount.c:1368:server_put_super()) no obd
> ddnlfs-OSTffff
> LustreError: 27520:0:(obd_mount.c:119:server_deregister_mount())
> ddnlfs-OSTffff not registered
>
> It too has loaded the ksocklnd module, and not the ko2iblnd module. I
> guess that both modules should be loaded in a multihomed case?
>
> What am I doing wrong?
>
> Thanks,
>
> Chris