[lustre-discuss] trouble mounting after a tunefs

John White jwhite at lbl.gov
Fri Jun 12 08:07:01 PDT 2015


Good Morning Folks,
	We recently had to add TCP NIDs to an existing o2ib FS.  We added the new NID to the LNet configuration in modprobe.d and added the NID definitions to the failnode and mgsnode params on all OSTs and on the MGS + MDT.  When either an o2ib or a tcp client tries to mount, the mount command hangs and dmesg repeats:
LustreError: 11-0: brc-MDT0000-mdc-ffff881036879c00: Communicating with 10.4.250.10@o2ib, operation mds_connect failed with -11.
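
For anyone reading along: the trailing -11 is a negated Linux errno, and 11 is EAGAIN, which is generally read as "try again" rather than an outright refusal. A minimal Python sketch for decoding such return codes follows; the table is a small hand-picked subset of Linux errno values and the helper name describe is only illustrative, not part of any Lustre tool.

```python
# Hand-picked subset of Linux errno values (an assumption of this sketch;
# the table is hard-coded so the result does not depend on the local OS).
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    5: ("EIO", "Input/output error"),
    11: ("EAGAIN", "Resource temporarily unavailable"),
    19: ("ENODEV", "No such device"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def describe(rc):
    """Turn a negative return code from a log line into a readable string."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in this table"))
    return f"{rc}: {name} ({text})"


print(describe(-11))  # prints "-11: EAGAIN (Resource temporarily unavailable)"
```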

I fear we may have overdone the parameters.  Could anyone take a look at the tunefs output below and let me know if we need to fix things up (remove params, etc.)?

MGS:
Read previous values:
Target:     MGS
Index:      unassigned
Lustre FS:  
Mount type: ldiskfs
Flags:      0x4
              (MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

MDT:
Read previous values:
Target:     brc-MDT0000
Index:      0
Lustre FS:  brc
Mount type: ldiskfs
Flags:      0x1001
              (MDT no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:  mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp  failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp mdt.quota_type=ug

OST (sample):
Read previous values:
Target:     brc-OST0002
Index:      2
Lustre FS:  brc
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: errors=remount-ro
Parameters:  mgsnode=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp  failover.node=10.4.250.12@o2ib,10.0.250.12@tcp:10.4.250.13@o2ib,10.0.250.13@tcp ost.quota_type=ug
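
To make the two Parameters lines above easier to compare, here is a minimal Python sketch that splits them into per-node NID lists, with the NIDs written in the usual address@network form. It assumes whitespace separates parameters, ":" separates nodes, and "," separates the NIDs of one node, which matches the output shown here; the helper name parse_params is hypothetical and not part of any Lustre tool.

```python
def parse_params(line):
    """Split a 'Parameters:' value into {key: [[nid, ...], ...]}."""
    params = {}
    for token in line.split():
        key, _, value = token.partition("=")
        # ':' separates nodes, ',' separates the NIDs of a single node.
        nodes = [node.split(",") for node in value.split(":")]
        params.setdefault(key, []).extend(nodes)
    return params


# Parameter strings copied from the MDT and OST output above.
mdt = ("mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp "
       "failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp "
       "mdt.quota_type=ug")
ost = ("mgsnode=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp "
       "failover.node=10.4.250.12@o2ib,10.0.250.12@tcp:10.4.250.13@o2ib,10.0.250.13@tcp "
       "ost.quota_type=ug")

mdt_mgs = parse_params(mdt)["mgsnode"]
ost_mgs = parse_params(ost)["mgsnode"]

# Same set of MGS nodes on both targets, but listed in a different order.
same_nodes = sorted(mdt_mgs) == sorted(ost_mgs)
same_order = mdt_mgs == ost_mgs
print("same MGS nodes:", same_nodes)
print("same MGS order:", same_order)
```

On the values above, the MDT and the OST name the same two MGS nodes but list them in opposite order. The sketch only surfaces that difference; it cannot say whether the ordering is related to the mount hang.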

