[Lustre-discuss] Added Dual-homed OSS; ethernet clients confused
Chris Worley
worleys at gmail.com
Mon Apr 21 20:31:37 PDT 2008
On Mon, Apr 21, 2008 at 9:22 PM, Chris Worley <worleys at gmail.com> wrote:
> The only configuration error on my OSS was: I initially only had
> "o2ib0(ib0)" in my modprobe.conf. After unmounting all the OSTs, and
> getting the modprobe.conf right:
>
> options lnet networks=o2ib0(ib0),tcp0(eth0)
>
> ...and remounting from scratch, both ksocklnd and ko2iblnd are now
> loaded properly.
>
> But, I still can't mount the partition on the ethernet-only client nodes.
>
> They get the error:
>
> LustreError: 8439:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
> for 36.102.29.4@o2ib
> LustreError: 8439:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
> find peer 36.102.29.4@o2ib!
> LustreError: 8439:0:(ldlm_lib.c:312:client_obd_setup()) can't add
> initial connection
> LustreError: 8439:0:(obd_config.c:325:class_setup()) setup
> lfs-OST0026-osc-0000010753919000 failed (-2)
> LustreError: 8439:0:(obd_config.c:1062:class_config_llog_handler())
> Err -2 on cfg command:
> Lustre: cmd=cf003 0:lfs-OST0026-osc 1:lfs-OST0026_UUID 2:36.102.29.4@o2ib
> LustreError: 15c-8: MGC36.101.29.1@tcp: The configuration from log
> 'lfs-client' failed (-2).
>
> 36.102.29.4 is the IPoIB address of the added OSS. An Ethernet-only
> client shouldn't be trying to reach it via "@o2ib".
>
> I've also unmounted all Lustre mounts on the MGS/MDS, unloaded all the
> modules and remounted. Still no joy.
>
From this point forward, every time I say "OST" I mean "OSS"...
> The file systems were created on the new OST, just as on all the others:
>
> for i in b c d e f g h i j k l; do mkfs.lustre --ost
> --mgsnode="36.102.29.1@o2ib0,36.101.29.1@tcp0" --fsname=lfs --param
> sys.timeout=40 --param lov.stripesize=2M /dev/sd$i & done
>
> The client has the right modprobe.conf, which worked before the additional OST:
>
> options lnet networks=tcp0(eth0)
>
> ... and I'm using the same mount command that worked previously:
>
> mount -t lustre 36.101.29.1@tcp:/lfs /lfs
>
> From the OST I can ping the client:
>
> # lctl list_nids
> 36.102.29.4@o2ib
> 36.101.29.4@tcp
> # lctl ping 36.101.255.10@tcp
> 12345-0@lo
> 12345-36.101.255.10@tcp
>
> From the client, I can ping the OST and MDS/MGS:
>
> # lctl list_nids
> 36.101.255.10@tcp
> # lctl ping 36.101.29.4@tcp
> 12345-0@lo
> 12345-36.102.29.4@o2ib
> 12345-36.101.29.4@tcp
> # lctl ping 36.101.29.1@tcp
> 12345-0@lo
> 12345-36.102.29.1@o2ib
> 12345-36.101.29.1@tcp
>
> So, somehow, not having the right modprobe.conf the first time I
> mounted the partitions on the new OST has made it permanently not want
> to mount properly on Ethernet clients (it mounts fine on IB clients).
>
> Any ideas?
>
> Thanks,
>
> Chris
>
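
The NID checks in the quoted message (does a node advertise a NID on the
network an Ethernet-only client needs?) can be scripted. This is only a
minimal sketch: nid_on_net is a hypothetical helper name, not a Lustre
tool, and it assumes input in the format that "lctl list_nids" and
"lctl ping" print in the output quoted in this message. It treats a bare
"tcp" or "o2ib" as network 0, the same as "tcp0" or "o2ib0".

```shell
# Hypothetical helper (not part of Lustre): succeed if any NID read from
# stdin is on the LNet network named in $1.
# Input: one NID per line, as printed by `lctl list_nids`, optionally with
# the "12345-" prefix that `lctl ping` adds.
nid_on_net() {
    net="$1"
    # Drop an optional "<digits>-" prefix, then compare the part after "@".
    sed -e 's/^[0-9]*-//' | awk -F'@' -v net="$net" '
        # A network name without a trailing number means network 0.
        function canon(n) { return (n ~ /[0-9]$/) ? n : n "0" }
        NF == 2 && canon($2) == canon(net) { found = 1 }
        END { exit(found ? 0 : 1) }'
}
```

On a live node this could be used as "lctl list_nids | nid_on_net tcp0"
or "lctl ping 36.101.29.4@tcp | nid_on_net tcp0". It only reports what
the node advertises over LNet; it does not inspect the configuration log
that the client receives from the MGS.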