[lustre-discuss] [SPAMMY (6.924)] Lustre in HA-LVM Cluster issue

Udai Sharma udai.sharma at chelsio.com
Sun Aug 25 09:51:18 PDT 2019


Hello Team,
Could you please help me out here?

-Udai

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> On Behalf Of Udai Sharma
Sent: Friday, August 23, 2019 4:40 PM
To: lustre-discuss at lists.lustre.org
Subject: [SPAMMY (6.924)][lustre-discuss] Lustre in HA-LVM Cluster issue

Hi Team,

Starting with topology and configurations:

---------------------------
Topology:

[HA1]<-----[N3]----->[N4]
  |          |          |
  |          |          |
  +-------[Client]------+

[N1, N2] = HA1 --> OSTs
 N3            --> MGS
 N4            --> MDT

N1 -> 3 Logical volumes [OST1,OST2,OST3]
N2 -> 3 Logical volumes [OST4,OST5,OST6]
N3 -> 1 Logical volume  [MGT1]
N4 -> 1 Logical volume  [MDT1]
------------------------------------------

N3 [MGS]

Created the zpool, formatted it, and mounted it:

zpool create -f -O canmount=off -o multihost=on -o cachefile=none lustre  /dev/mgs/mgs01
mkfs.lustre --reformat --mgs --backfstype=zfs lustre/mgs01
mount.lustre lustre/mgs01 /mnt/mgs/
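
To confirm the MGS came up, checks along these lines can be run on N3 (a minimal sketch, output not captured here):

zpool list lustre        # pool imported and healthy
zfs list lustre/mgs01    # dataset present
lctl dl                  # MGS/MGC devices listed
lctl list_nids           # should include 10.2.2.202@tcp1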

------------------------------------------

N4 [MDT]

Created the zpool, formatted it, and mounted it:

zpool create -f -O canmount=off -o multihost=on -o cachefile=none lustre /dev/mdt/mdt01
mkfs.lustre --reformat --mdt --fsname=lustre --index=0 --mgsnode=10.2.2.202@tcp1 --backfstype=zfs lustre/mdt01
mount.lustre lustre/mdt01 /mnt/mdt
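
For reference on issue 2 below, the MDT recovery state can be queried directly on N4 with the standard lctl parameter (shown as a sketch, output not captured here):

lctl get_param mdt.lustre-MDT0000.recovery_status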

----------------------------------------

HA1 [HA-LVM system]

N1 [OST1,OST2,OST3]

Created the zpool, formatted it, and mounted it:
zpool create lustre -f -O canmount=off -o multihost=on -o cachefile=none /dev/vg_e/thinvolume1 /dev/vg_e/thinvolume2 /dev/vg_e/thinvolume3
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=111 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost01 ; mount.lustre lustre/ost01 /mnt/ost01/
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=222 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost02 ; mount.lustre lustre/ost02 /mnt/ost02/
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=333 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost03 ; mount.lustre lustre/ost03 /mnt/ost03/
df -h | grep lustre
lustre/ost01    287G  3.0M  287G   1% /mnt/ost01
lustre/ost02    287G  3.0M  287G   1% /mnt/ost02
lustre/ost03    287G  3.0M  287G   1% /mnt/ost03
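
A note on the pool options: since multihost=on is set, ZFS multihost (MMP) protection requires every node that may import the pool to have a unique, non-zero host ID. The usual preparation step on each server (assumed here, not part of the commands above) is:

zgenhostid    # writes a random host ID to /etc/hostid if none exists
hostid        # verify a non-zero value on each node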


N2 [OST4,OST5,OST6]

Created the zpool, formatted it, and mounted it:
zpool create -f -O canmount=off -o multihost=on -o cachefile=none lustre  /dev/vg_p/thinvolume1 /dev/vg_p/thinvolume2 /dev/vg_p/thinvolume3
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=444 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost04 ; mount.lustre lustre/ost04 /mnt/ost04
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=555 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost05 ; mount.lustre lustre/ost05 /mnt/ost05
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --index=666 --mgsnode=10.2.2.202@tcp1 --servicenode=10.2.2.239@tcp1:10.2.2.241@tcp1 lustre/ost06 ; mount.lustre lustre/ost06 /mnt/ost06

df -h | grep lustre
lustre/ost04    287G  3.0M  287G   1% /mnt/ost04
lustre/ost05    287G  3.0M  287G   1% /mnt/ost05
lustre/ost06    287G  3.0M  287G   1% /mnt/ost06

Created PCS cluster over HA1.

Resource Group: electron
     vg_e       (ocf::heartbeat:LVM):   Started gp-electron
     zfs-pool-electron  (ocf::heartbeat:ZFS):   Started electron
     lustre-ost1        (ocf::heartbeat:Lustre):        Started electron
     lustre-ost2        (ocf::heartbeat:Lustre):        Started electron
     lustre-ost3        (ocf::heartbeat:Lustre):        Started electron
 Resource Group: proton
     vg_p       (ocf::heartbeat:LVM):   Started gp-proton
     zfs-pool-proton    (ocf::heartbeat:ZFS):   Started proton
     lustre-ost4        (ocf::heartbeat:Lustre):        Started proton
     lustre-ost5        (ocf::heartbeat:Lustre):        Started proton
     lustre-ost6        (ocf::heartbeat:Lustre):        Started proton
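
For reference, a group such as 'electron' can be created with pcs roughly as follows; this is a sketch based on the resource-agent parameters (volgrpname, pool, target, mountpoint), the exact commands I ran are not included in this mail:

pcs resource create vg_e ocf:heartbeat:LVM volgrpname=vg_e exclusive=true --group electron
pcs resource create zfs-pool-electron ocf:heartbeat:ZFS pool=lustre --group electron
pcs resource create lustre-ost1 ocf:heartbeat:Lustre target=lustre/ost01 mountpoint=/mnt/ost01 --group electron
pcs resource create lustre-ost2 ocf:heartbeat:Lustre target=lustre/ost02 mountpoint=/mnt/ost02 --group electron
pcs resource create lustre-ost3 ocf:heartbeat:Lustre target=lustre/ost03 mountpoint=/mnt/ost03 --group electron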

----------------------------------------

Client:

# mount | grep lustre
10.2.2.202@tcp1:/lustre on /lustre type lustre (rw,lazystatfs)
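
From the client, basic connectivity can also be verified with the standard lfs commands (sketch only, output omitted):

lfs df -h            # per-OST space as seen by the client
lfs check servers    # ping every MDS/OSS connection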

# lfs osts
OBDS:
1: lustre-OST0001_UUID INACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID INACTIVE
6: lustre-OST0006_UUID INACTIVE
10: lustre-OST000a_UUID INACTIVE
11: lustre-OST000b_UUID INACTIVE
20: lustre-OST0014_UUID INACTIVE
22: lustre-OST0016_UUID INACTIVE
30: lustre-OST001e_UUID INACTIVE
33: lustre-OST0021_UUID INACTIVE
40: lustre-OST0028_UUID INACTIVE
44: lustre-OST002c_UUID INACTIVE
50: lustre-OST0032_UUID INACTIVE
55: lustre-OST0037_UUID INACTIVE
60: lustre-OST003c_UUID INACTIVE
66: lustre-OST0042_UUID INACTIVE
111: lustre-OST006f_UUID ACTIVE
222: lustre-OST00de_UUID ACTIVE
333: lustre-OST014d_UUID ACTIVE
444: lustre-OST01bc_UUID ACTIVE
555: lustre-OST022b_UUID ACTIVE
666: lustre-OST029a_UUID ACTIVE
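
Side note: the UUID suffix is simply the decimal --index value in hexadecimal (111 -> 0x6f, 222 -> 0xde, ..., 666 -> 0x29a), so the low-numbered INACTIVE entries are presumably leftovers of the earlier format attempts mentioned in issue 1 below. A quick check:

printf '%x\n' 111    # prints 6f
printf '%x\n' 666    # prints 29a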

# lfs mdts
MDTS:
0: lustre-MDT0000_UUID ACTIVE



The OSTs are part of an active-passive HA-LVM cluster. As configured, if a resource fails, the resource agent moves it to the other node; that part is working fine.
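
For a controlled failover test, a whole group can be pushed to the peer with standard pcs commands (node names as shown in the status output above):

pcs node standby gp-electron      # move everything off that node
pcs status                        # watch the group restart on the peer
pcs node unstandby gp-electron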

Issues seen:

  1.  OST00* entries go INACTIVE when the corresponding disk is unmounted from the OST server, and they never become active again even after the disk is remounted. I had to reformat every time with a new --index to get the volumes listed again; hence the many inactive entries in the 'lctl dl' output.
  2.  recovery_status on N4 [MDT] is always INACTIVE. How can recovery be enabled in case of a failure?
  3.  How can the INACTIVE objects on the OSTs be re-activated? (See the sketch after this list.)
  4.  When an HA resource fails over, the services move to the peer node, but I/O (iozone or dd) never resumes.
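
For issues 1-3, here is a sketch of the lctl commands that, as far as I understand from the Lustre manual, are relevant (device names are taken from the output above; please correct me if this is the wrong approach):

lctl dl                                            # list local Lustre devices and their state
lctl set_param osc.lustre-OST0001-*.active=1       # try to re-activate one of the INACTIVE connections (on client and MDS)
lctl get_param mdt.lustre-MDT0000.recovery_status  # recovery progress on N4
lctl conf_param lustre-OST0001.osc.active=0        # or permanently mark a stale OST inactive (run on the MGS)

Regenerating the configuration logs with 'tunefs.lustre --writeconf' on all targets might also drop the stale OST entries, if I read the manual correctly.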

Please advise.

PS: lctl 2.12.2

Thanks in advance.
Udai

