[Lustre-discuss] Fwd: Multihomed question: want Lustre over IB and Ethernet
Chris Worley
worleys at gmail.com
Fri Mar 7 06:41:31 PST 2008
I changed my modprobe.conf to look exactly as yours, and it worked. I
hadn't been using all the quotes until the doc said to... but they may
have indeed been the problem.
Thanks!
Chris
On Fri, Mar 7, 2008 at 3:40 AM, Charles Taylor <taylor at hpc.ufl.edu> wrote:
>
>
> Do "lctl list_nids" on your mds and oss's. They should look
> something like this.
>
> [root@hpcmds ~]# lctl list_nids
> 10.13.24.40@o2ib
> 10.13.16.40@tcp
>
> Then your clients should have a nid on one or the other.
>
> Check your dmesg output after loading lnet. The complaints are
> pretty useful. Your modprobe.conf line looks correct although we
> found we did not need all the quoting so you should check that as
> well. Ours looks like...
>
> options lnet networks=o2ib(ib0),tcp(eth0)
>
> My guess is that it either cannot find or does not like your ko2iblnd
> module.
>
> ct
>
>
>
> On Mar 7, 2008, at 12:46 AM, Chris Worley wrote:
>
> > Most everything is over IB, but I have a few systems I'd like to mount
> > the Lustre fs over GigE.
> >
> > I think I've followed the Multihomed instructions correctly, in:
> >
> > http://dlc.sun.com/pdf/820-3681/820-3681.pdf
> >
> > My /etc/modprobe.conf on mds/mgs/oss servers (which all have both
> > Ethernet and IB) includes:
> >
> > options lnet 'networks="tcp0(eth0),o2ib0(ib0)"'
> >
> > I make and mount the mdt with (which has both IB and Ethernet, subnet
> > 36.122.x.x is IB, 36.121.x.x is Ethernet):
> >
> > # mkfs.lustre --mdt --mgs
> > --mgsnode="36.122.255.201@o2ib0,36.121.255.201@tcp0" <... > /dev/md0
> > # mount -t lustre /dev/md0 /lfs/mdtb
> >
> > But, at this point, the ksocklnd module is loaded rather than the
> > ko2iblnd module!
> >
> > On the OSS, I make the fs w/ the same "mgsnode", but, when I try to
> > mount it, it correctly uses the IB interface, but can't contact the
> > MDS:
> >
> > LustreError: 27520:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
> > for MGC36.122.255.201@o2ib_0
> > LustreError: 27520:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
> > find peer MGC36.122.255.201@o2ib_0!
> > LustreError: 27520:0:(ldlm_lib.c:312:client_obd_setup()) can't add
> > initial connection
> > LustreError: 17126:0:(connection.c:142:ptlrpc_put_connection())
> > NULL connection
> > LustreError: 27520:0:(obd_config.c:325:class_setup()) setup
> > MGC36.122.255.201@o2ib failed (-2)
> > LustreError: 27520:0:(obd_mount.c:454:lustre_start_simple())
> > MGC36.122.255.201@o2ib setup error -2
> > LustreError: 27520:0:(obd_mount.c:1368:server_put_super()) no obd
> > ddnlfs-OSTffff
> > LustreError: 27520:0:(obd_mount.c:119:server_deregister_mount())
> > ddnlfs-OSTffff not registered
> >
> > It too has loaded the ksocklnd module, and not the ko2iblnd module. I
> > guess that both modules should be loaded in a multihomed case?
> >
> > What am I doing wrong?
> >
> > Thanks,
> >
> > Chris
>
>
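
For anyone hitting the same problem, the check suggested above (run "lctl list_nids" on each server and confirm there is a NID on both networks) can be scripted. The following is only a minimal sketch, not something from the Lustre tools: the function name has_lnet_networks is made up for illustration, and it assumes NIDs are printed one per line in the usual address@network form (for example 10.13.24.40@o2ib), optionally with a trailing network number such as tcp0.

```shell
# Hypothetical helper (not part of Lustre): reads `lctl list_nids`-style
# output on stdin and succeeds only if every network type named as an
# argument has at least one NID.
has_lnet_networks() {
    nids=$(cat)
    for net in "$@"; do
        # A NID looks like ADDRESS@NETWORK, e.g. 10.13.24.40@o2ib or ...@tcp0
        if ! printf '%s\n' "$nids" | grep -Eq "@${net}[0-9]*\$"; then
            echo "missing NID on network: ${net}" >&2
            return 1
        fi
    done
    return 0
}
```

On a multihomed server it could be used as "lctl list_nids | has_lnet_networks o2ib tcp". If it reports a missing network, the dmesg output from loading lnet is the next place to look, as suggested earlier in the thread.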