[lustre-discuss] How to make OSTs active again

Pinkesh Valdria pinkesh.valdria at oracle.com
Thu Sep 30 11:12:42 PDT 2021

I have a simple Lustre setup (1 MGS, 1 MDS with 2 MDTs, 2 OSSs with 2 OSTs each, and 1 client node to run some I/O load). I was testing what happens if one of the OSSs dies without any impact to the data. To recover from the failed OSS, I created a new instance and attached the 2 OSTs from the failed node. I assumed that, since I am reusing the existing OSTs from the failed node and their indexes remain the same, I could mount them directly, as below:

mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

After trying that several times, I also tried the following:
Ran mkfs.lustre on the OSTs:
mkfs.lustre --fsname=lustrefs --index=2 --ost --mgsnode=@tcp1 /dev/oracleoci/oraclevdb
mkfs.lustre --fsname=lustrefs --index=3 --ost --mgsnode=@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Ran mkfs.lustre on the OSTs with --reformat --replace:
mkfs.lustre --fsname=lustrefs --reformat --replace --index=2 --ost --mgsnode=@tcp1 /dev/oracleoci/oraclevdb
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1

mkfs.lustre --fsname=lustrefs --reformat --replace --index=3 --ost --mgsnode=@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2
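
The --index values passed to mkfs.lustre are what tie each device to a target name such as lustrefs-OST0002, where the index is written as four hexadecimal digits. As a minimal sketch of that mapping (the helper name ost_target_name is hypothetical and not part of Lustre):

```shell
# Hypothetical helper: build the Lustre target name for an OST index.
# The index is rendered as four hexadecimal digits (index 10 gives OST000a).
ost_target_name() {
    fsname=$1
    index=$2
    printf '%s-OST%04x\n' "$fsname" "$index"
}

ost_target_name lustrefs 2   # lustrefs-OST0002
ost_target_name lustrefs 3   # lustrefs-OST0003
```

The name printed here can be compared with the target name that "tunefs.lustre --dryrun <device>" reports for an existing OST, which should read the on-disk configuration without modifying the device.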


  1.  After the OSS node was replaced, the mount on the client node was still hung, and I had to reboot the client node for the mount to work. Is there some configuration I need to set so that it recovers automatically?
  2.  On the client node, the 2 OSTs show as INACTIVE. How do I make them active again? I read on forums that I should run "lctl --device <device_name> recover/activate"; I ran that on the MDS and on the client, and the OSTs still show INACTIVE. It was also unclear what to pass as <device_name> and where to find the correct name.

[root@client-1 ~]# lfs osts
0: lustrefs-OST0000_UUID ACTIVE
1: lustrefs-OST0001_UUID ACTIVE
2: lustrefs-OST0002_UUID INACTIVE
3: lustrefs-OST0003_UUID INACTIVE

[root@client-1 ~]# lctl dl
  0 UP mgc MGC10.0.6.2@tcp1 0e4fae60-66e5-963d-1aea-59b80f9fd77b 4
  1 UP lov lustrefs-clilov-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 3
  2 UP lmv lustrefs-clilmv-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  3 UP mdc lustrefs-MDT0000-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  4 UP mdc lustrefs-MDT0001-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  5 UP osc lustrefs-OST0002-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  6 UP osc lustrefs-OST0003-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  7 UP osc lustrefs-OST0000-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  8 UP osc lustrefs-OST0001-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
[root@client-1 ~]#
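
On the question of what to pass as <device_name>: in the client listing above, the first column of "lctl dl" is the device number and the fourth is the device name, and the osc entries are the ones tied to individual OSTs. A minimal sketch that picks these out of the two outputs (the function names inactive_osts and osc_devices_for are hypothetical helpers, not Lustre commands):

```shell
# List OST names reported as INACTIVE; reads `lfs osts` output on stdin.
inactive_osts() {
    awk '$3 == "INACTIVE" { sub(/_UUID$/, "", $2); print $2 }'
}

# Print "<device number> <device name>" for each osc device that belongs
# to the given OST; reads `lctl dl` output on stdin.
osc_devices_for() {
    awk -v ost="$1" '$3 == "osc" && index($4, ost "-osc-") == 1 { print $1, $4 }'
}
```

On the client this would be used as "lfs osts | inactive_osts" followed by "lctl dl | osc_devices_for lustrefs-OST0002". For the listing above, the second command should give device 5, lustrefs-OST0002-osc-ffff89259ae86000. Either the number or the name can then be passed to "lctl --device", or the documented parameter form "lctl set_param osc.lustrefs-OST0002-osc-*.active=1" can be tried. Whether that clears the INACTIVE state in this setup is not confirmed here.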

MDS node
$ sudo lctl dl
  0 UP osd-ldiskfs lustrefs-MDT0001-osd lustrefs-MDT0001-osd_UUID 10
  1 UP osd-ldiskfs lustrefs-MDT0000-osd lustrefs-MDT0000-osd_UUID 11
  2 UP mgc MGC10.0.6.2@tcp1 acc3160e-9975-9262-89e1-8dc66812ac94 4
  3 UP mds MDS MDS_uuid 2
  4 UP lod lustrefs-MDT0000-mdtlov lustrefs-MDT0000-mdtlov_UUID 3
  5 UP mdt lustrefs-MDT0000 lustrefs-MDT0000_UUID 18
  6 UP mdd lustrefs-MDD0000 lustrefs-MDD0000_UUID 3
  7 UP qmt lustrefs-QMT0000 lustrefs-QMT0000_UUID 3
  8 UP osp lustrefs-MDT0001-osp-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
  9 UP osp lustrefs-OST0002-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 10 UP osp lustrefs-OST0003-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 11 UP osp lustrefs-OST0000-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 12 UP osp lustrefs-OST0001-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 13 UP lwp lustrefs-MDT0000-lwp-MDT0000 lustrefs-MDT0000-lwp-MDT0000_UUID 4
 14 UP lod lustrefs-MDT0001-mdtlov lustrefs-MDT0001-mdtlov_UUID 3
 15 UP mdt lustrefs-MDT0001 lustrefs-MDT0001_UUID 14
 16 UP mdd lustrefs-MDD0001 lustrefs-MDD0001_UUID 3
 17 UP osp lustrefs-MDT0000-osp-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 18 UP osp lustrefs-OST0002-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 19 UP osp lustrefs-OST0003-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 20 UP osp lustrefs-OST0000-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 21 UP osp lustrefs-OST0001-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 22 UP lwp lustrefs-MDT0000-lwp-MDT0001 lustrefs-MDT0000-lwp-MDT0001_UUID 4
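
On the MDS, each OST appears once per MDT as an osp device (for example lustrefs-OST0002-osc-MDT0000 and lustrefs-OST0002-osc-MDT0001), so there are two device numbers per OST in this setup. A minimal sketch, assuming the "lctl dl" layout shown above, that only prints the candidate activate commands for review and does not run them (print_osp_activate_cmds is a hypothetical helper name):

```shell
# Print (do not run) one `lctl --device <N> activate` line per osp device
# that serves the given OST; reads the MDS `lctl dl` output on stdin.
print_osp_activate_cmds() {
    awk -v ost="$1" '$3 == "osp" && index($4, ost "-osc-MDT") == 1 {
        print "lctl --device " $1 " activate"
    }'
}
```

Running "sudo lctl dl | print_osp_activate_cmds lustrefs-OST0002" against the MDS listing above should print the commands for devices 9 and 18. Printing instead of executing keeps the step reviewable before anything is changed on the MDS.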

Pinkesh Valdria
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA