[Lustre-discuss] Added Dual-homed OSS; ethernet clients confused

Chris Worley worleys at gmail.com
Mon Apr 21 20:31:37 PDT 2008


On Mon, Apr 21, 2008 at 9:22 PM, Chris Worley <worleys at gmail.com> wrote:
> The only configuration error on my OSS was: I initially only had
>  "o2ib0(ib0)" in my modprobe.conf.  After unmounting all the OSTs, and
>  getting the modprobe.conf right:
>
>    options lnet networks=o2ib0(ib0),tcp0(eth0)
>
>  ...and remounting from scratch, both ksocklnd and ko2iblnd are now
>  loaded properly.
>
>  But, I still can't mount the partition on the ethernet-only client nodes.
>
>  They get the error:
>
>  LustreError: 8439:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
>  for 36.102.29.4@o2ib
>  LustreError: 8439:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
>  find peer 36.102.29.4@o2ib!
>  LustreError: 8439:0:(ldlm_lib.c:312:client_obd_setup()) can't add
>  initial connection
>  LustreError: 8439:0:(obd_config.c:325:class_setup()) setup
>  lfs-OST0026-osc-0000010753919000 failed (-2)
>  LustreError: 8439:0:(obd_config.c:1062:class_config_llog_handler())
>  Err -2 on cfg command:
>  Lustre:    cmd=cf003 0:lfs-OST0026-osc  1:lfs-OST0026_UUID  2:36.102.29.4@o2ib
>  LustreError: 15c-8: MGC36.101.29.1@tcp: The configuration from log
>  'lfs-client' failed (-2).
>
>  The 36.102.29.4 is the IPoIB address of the added OSS.  It shouldn't
>  want it "@o2ib".
>
>  I've also unmounted all Lustre mounts on the MGS/MDS, unloaded all the
>  modules and remounted.  Still no joy.
>
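
To double-check which NIDs the client is being handed that it has no
network for, something like the following could be run against the client's
kernel log. This is only a minimal sketch: foreign_nids and norm_net are
hypothetical helper names (not part of Lustre), and it assumes IPv4-style
NIDs and that a network name without a number (tcp, o2ib) means network 0.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Lustre.

# Normalise a network name: "tcp" and "tcp0" are treated as the same network.
norm_net() {
    case $1 in
        *[0-9]) printf '%s\n' "$1" ;;
        *)      printf '%s0\n' "$1" ;;
    esac
}

# Read log text on stdin and print each NID whose network is not in the
# space-separated list of local networks given as $1.
foreign_nids() {
    local_norm=""
    for n in $1; do
        local_norm="$local_norm $(norm_net "$n")"
    done
    # Some list archives rewrite "@" as " at "; undo that so NIDs parse.
    sed 's/\([0-9]\) at \([a-z]\)/\1@\2/g' |
    grep -oE '[0-9]+(\.[0-9]+){3}@[a-z0-9]+' |
    sort -u |
    while IFS=@ read -r addr net; do
        case "$local_norm " in
            *" $(norm_net "$net") "*) ;;       # client has this network
            *) printf '%s@%s\n' "$addr" "$net" ;;
        esac
    done
}
```

On the ethernet-only client this would be used as
dmesg | foreign_nids "tcp0", where "tcp0" is the network from the client's
modprobe.conf. Any NID it prints is one the client cannot reach.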

From this point forward, every time I say "OST" I mean "OSS"...

>  The file systems were created on the new OST, just as on all the others:
>
>  for i in b c d e f g h i j k l; do mkfs.lustre --ost
>  --mgsnode="36.102.29.1@o2ib0,36.101.29.1@tcp0" --fsname=lfs --param
>  sys.timeout=40 --param lov.stripesize=2M /dev/sd$i & done
>
>  The client has the right modprobe.conf, which worked before the additional OST:
>
>   options lnet networks=tcp0(eth0)
>
>  ... and I'm using the same mount command that worked previously:
>
>   mount -t lustre 36.101.29.1@tcp:/lfs /lfs
>
>  From the OST I can ping the client:
>
>  # lctl list_nids
>  36.102.29.4@o2ib
>  36.101.29.4@tcp
>  # lctl ping 36.101.255.10@tcp
>  12345-0@lo
>  12345-36.101.255.10@tcp
>
>  From the client, I can ping the OST and MDS/MGS:
>
>  # lctl list_nids
>  36.101.255.10@tcp
>  # lctl ping 36.101.29.4@tcp
>  12345-0@lo
>  12345-36.102.29.4@o2ib
>  12345-36.101.29.4@tcp
>  # lctl ping 36.101.29.1@tcp
>  12345-0@lo
>  12345-36.102.29.1@o2ib
>  12345-36.101.29.1@tcp
>
>  So, somehow, not having the right modprobe.conf the first time I
>  mounted the partitions on the new OST has made it permanently not want
>  to mount properly on Ethernet clients (it mounts fine on IB clients).
>
>  Any ideas?
>
>  Thanks,
>
>  Chris
>


