[lustre-discuss] OSTffff created :-(
Torsten Harenberg
harenberg at physik.uni-wuppertal.de
Wed May 23 06:04:44 PDT 2018
Dear all,
we have been running a Lustre 2.5.3 installation for a couple of years
already. The devices come from a 3PAR SAN appliance.
Our users asked us to enlarge the available disk space, so we exported
two new LUNs to the OST servers.
The file systems were created with:
mkfs.lustre --fsname=lustre --ost --index 15 --backfstype=ldiskfs
--failnode=<IP>@tcp --mgsnode=<IP>@tcp
--mgsnode=<IP>@tcp --verbose /dev/mapper/OST000F
which went fine.
However, after mounting, the file system appears as
lustre-OSTffff_UUID 8585168804 35177704 8120481472 0%
/lustre[OST:65535]
in lfs df.
And lfs df prints 65k+ lines with
OSTfff5 : Resource temporarily unavailable
OSTfff6 : Resource temporarily unavailable
OSTfff7 : Resource temporarily unavailable
OSTfff8 : Resource temporarily unavailable
OSTfff9 : Resource temporarily unavailable
OSTfffa : Resource temporarily unavailable
OSTfffb : Resource temporarily unavailable
OSTfffc : Resource temporarily unavailable
OSTfffd : Resource temporarily unavailable
OSTfffe : Resource temporarily unavailable
in between.
Searching for the root cause of this, we saw:
------
[root at lustre4 ~]# tunefs.lustre /dev/mapper/OST000F
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: lustre-OSTffff
Index: 15
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro
Parameters: failover.node=<IP>@tcp
mgsnode=<IP>@tcp mgsnode=<IP>@tcp
Permanent disk data:
Target: lustre-OST000f
Index: 15
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro
Parameters: failover.node=<IP>@tcp
mgsnode=<IP>@tcp mgsnode=<IP>@tcp
------
We have no idea where the
Read previous values:
Target: lustre-OSTffff
comes from.
We then tried to free the OST immediately, which turned out to be
more complicated than expected.
We tried to follow the manual and issued on the MDS:
[root at lustre1 ~]# lctl --device lustre-OSTffff-osc-MDT0000 deactivate
But the device is still "UP":
[root at lustre1 ~]# lctl dl
0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 24
1 UP mgs MGS MGS 427
2 UP mgc MGC132.195.124.201 at tcp 17eb290e-d0a6-2047-3250-84f893ebc47a 5
3 UP mds MDS MDS_uuid 3
4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4
5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 455
6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4
7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4
8 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
9 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
10 UP osp lustre-OST0002-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
11 UP osp lustre-OST0003-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
12 UP osp lustre-OST0004-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
13 UP osp lustre-OST0005-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
14 UP osp lustre-OST0006-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
15 UP osp lustre-OST0007-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
16 UP osp lustre-OST0008-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
17 UP osp lustre-OST0009-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
18 UP osp lustre-OST000a-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
19 UP osp lustre-OST000b-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
20 UP osp lustre-OST000c-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
21 UP osp lustre-OST000d-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
22 UP osp lustre-OST000e-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
23 UP osp lustre-OSTffff-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
24 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5
We set it degraded on the OST:
[root at lustre4 ~]# lctl get_param obdfilter.*.degraded
obdfilter.lustre-OST0008.degraded=0
obdfilter.lustre-OST0009.degraded=0
obdfilter.lustre-OST000a.degraded=0
obdfilter.lustre-OST000b.degraded=0
obdfilter.lustre-OST000c.degraded=0
obdfilter.lustre-OST000d.degraded=0
obdfilter.lustre-OST000e.degraded=0
obdfilter.lustre-OSTffff.degraded=1
But the file system usage still grows:
[root at wnfg001 ~]# lfs df /lustre | grep ffff
lustre-OSTffff_UUID 8585168804 35159988 8120496592 0%
/lustre[OST:65535]
[root at wnfg001 ~]# lfs df /lustre | grep ffff
lustre-OSTffff_UUID 8585168804 35177704 8120481472 0%
/lustre[OST:65535]
We could stop the usage by setting it inactive on ALL (200+ in our
case) clients with
lctl set_param osc.lustre-OSTffff-*.active=0
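On our clients we pushed that out with something along these lines (a
sketch; pdsh and the hosts file are placeholders for whatever parallel
shell is in use):

```shell
# Deactivate the OSC for the broken OST on every client in one go.
# "clients" is a hypothetical file listing all 200+ client hostnames;
# any parallel shell (pdsh, clush, a for loop over ssh) works the same.
pdsh -w ^clients 'lctl set_param osc.lustre-OSTffff-*.active=0'
```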
But then the file system becomes unusable for the users:
-bash-4.1# touch
/lustre/gridsoft/arc/session/LeENDmtfhfsnsBfJnpimw0EmABFKDmABFKDmxSGKDmABFKDmhbxd6n/qq2
touch: setting times of
`/lustre/gridsoft/arc/session/LeENDmtfhfsnsBfJnpimw0EmABFKDmABFKDmxSGKDmABFKDmhbxd6n/qq2':
Cannot send after transport endpoint shutdown
The same is true for "lctl --device XX deactivate".
So we are looking for ways now to:
1.) set the OST read-only but keeping the file system usable
2.) then migrate what's on this OSTffff (we already started an lfs
find, but it takes very long)
3.) remove the OST and start from scratch.
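For step 2.), the migration we started is along these lines (a sketch
only; we are not sure lfs_migrate behaves identically on 2.5.3):

```shell
# Find regular files with objects on the bad OST (index 65535 == 0xffff)
# and rewrite them so their objects land on the remaining OSTs.
lfs find /lustre --obd lustre-OSTffff_UUID -type f |
    lfs_migrate -y
```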
It would also be really nice to understand where the OSTffff comes
from and how it can be avoided.
Any hint is really appreciated.
Best regards
Torsten
--
Dr. Torsten Harenberg harenberg at physik.uni-wuppertal.de
Bergische Universitaet
Fakultät 4 - Physik Tel.: +49 (0)202 439-3521
Gaussstr. 20 Fax : +49 (0)202 439-2811
42097 Wuppertal