[lustre-discuss] How to make OSTs active again
Pinkesh Valdria
pinkesh.valdria at oracle.com
Thu Sep 30 11:12:42 PDT 2021
I have a simple Lustre setup (1 MGS, 1 MDS with 2 MDTs, 2 OSSs with 2 OSTs each, and 1 client node to run some I/O load). I was testing what happens if one of the OSS nodes dies (with no damage to the OST data). To recover from the failed OSS, I created a new instance and attached the 2 OSTs from the failed node. Since I am reusing the existing OSTs from the failed node and their indices remain the same, I first tried mounting them directly, like below:
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2
Since that did not work, over several attempts I also tried the following.
Ran mkfs.lustre on the OSTs:
mkfs.lustre --fsname=lustrefs --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mkfs.lustre --fsname=lustrefs --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2
Ran mkfs.lustre on the OSTs with --reformat --replace:
mkfs.lustre --fsname=lustrefs --reformat --replace --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mkfs.lustre --fsname=lustrefs --reformat --replace --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2
Questions:
1. After the OSS node was replaced, the mount on the client node was still hung, and I had to reboot the client for the mount to work again. Is there some configuration I need to set so that it auto-recovers?
2. On the client node, I see the 2 OSTs showing as INACTIVE. How do I make them active again? I read on forums to run "lctl --device <device_name> recover" / "lctl --device <device_name> activate", and I ran that on both the MDS and the client, but they still show INACTIVE. It was confusing what to pass as <device_name> and where to find the correct name.
[root@client-1 ~]# lfs osts
OBDS:
0: lustrefs-OST0000_UUID ACTIVE
1: lustrefs-OST0001_UUID ACTIVE
2: lustrefs-OST0002_UUID INACTIVE
3: lustrefs-OST0003_UUID INACTIVE
[root@client-1 ~]# lctl dl
0 UP mgc MGC10.0.6.2@tcp1 0e4fae60-66e5-963d-1aea-59b80f9fd77b 4
1 UP lov lustrefs-clilov-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 3
2 UP lmv lustrefs-clilmv-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
3 UP mdc lustrefs-MDT0000-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
4 UP mdc lustrefs-MDT0001-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
5 UP osc lustrefs-OST0002-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
6 UP osc lustrefs-OST0003-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
7 UP osc lustrefs-OST0000-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
8 UP osc lustrefs-OST0001-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
[root@client-1 ~]#
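For completeness, this is roughly what I tried on the client. I was unsure whether <device_name> is the full osc name from `lctl dl` (devices 5 and 6 in the listing above, where the hex suffix comes from my mount and will differ on other systems) or the numeric device index, so I tried both:

```shell
# Check whether the client considers the OSC active (0 = inactive).
lctl get_param osc.lustrefs-OST0002-*.active

# Attempt to reactivate, passing the osc device name exactly as shown
# by `lctl dl` on the client (device 5 in the listing above).
lctl --device lustrefs-OST0002-osc-ffff89259ae86000 activate

# The numeric index from the first column of `lctl dl` should also work.
lctl --device 5 activate
```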
MDS node
$ sudo lctl dl
0 UP osd-ldiskfs lustrefs-MDT0001-osd lustrefs-MDT0001-osd_UUID 10
1 UP osd-ldiskfs lustrefs-MDT0000-osd lustrefs-MDT0000-osd_UUID 11
2 UP mgc MGC10.0.6.2@tcp1 acc3160e-9975-9262-89e1-8dc66812ac94 4
3 UP mds MDS MDS_uuid 2
4 UP lod lustrefs-MDT0000-mdtlov lustrefs-MDT0000-mdtlov_UUID 3
5 UP mdt lustrefs-MDT0000 lustrefs-MDT0000_UUID 18
6 UP mdd lustrefs-MDD0000 lustrefs-MDD0000_UUID 3
7 UP qmt lustrefs-QMT0000 lustrefs-QMT0000_UUID 3
8 UP osp lustrefs-MDT0001-osp-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
9 UP osp lustrefs-OST0002-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
10 UP osp lustrefs-OST0003-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
11 UP osp lustrefs-OST0000-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
12 UP osp lustrefs-OST0001-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
13 UP lwp lustrefs-MDT0000-lwp-MDT0000 lustrefs-MDT0000-lwp-MDT0000_UUID 4
14 UP lod lustrefs-MDT0001-mdtlov lustrefs-MDT0001-mdtlov_UUID 3
15 UP mdt lustrefs-MDT0001 lustrefs-MDT0001_UUID 14
16 UP mdd lustrefs-MDD0001 lustrefs-MDD0001_UUID 3
17 UP osp lustrefs-MDT0000-osp-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
18 UP osp lustrefs-OST0002-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
19 UP osp lustrefs-OST0003-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
20 UP osp lustrefs-OST0000-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
21 UP osp lustrefs-OST0001-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
22 UP lwp lustrefs-MDT0000-lwp-MDT0001 lustrefs-MDT0000-lwp-MDT0001_UUID 4
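On the MDS, I similarly tried reactivating the osp devices for the missing OSTs, using the device names taken from the `lctl dl` output above (there is one per OST per MDT in my setup):

```shell
# On the MDS: reactivate the connections from each MDT to the
# replaced OSTs (devices 9, 10, 18, 19 in the listing above).
lctl --device lustrefs-OST0002-osc-MDT0000 activate
lctl --device lustrefs-OST0003-osc-MDT0000 activate
lctl --device lustrefs-OST0002-osc-MDT0001 activate
lctl --device lustrefs-OST0003-osc-MDT0001 activate
```

The OSTs still show as INACTIVE on the client afterwards, which is why I am asking whether I am targeting the wrong devices or missing a step.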
Thanks,
Pinkesh Valdria
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
https://blogs.oracle.com/author/pinkesh-valdria