[lustre-discuss] OSTffff created :-(

Thomas Roth t.roth at gsi.de
Fri May 25 10:22:12 PDT 2018


Have you tried a --writeconf on your system?

The data that 'tunefs.lustre' reports as 'Permanent disk data': is it actually on disk now? In other
words, if you run 'tunefs.lustre --dryrun' on that OST, does it now show 'lustre-OST000f'?
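
For concreteness, the check could look like this (device path taken from the quoted mail below):

```shell
# On the OSS, print what tunefs.lustre sees without writing anything.
# --dryrun only reads CONFIGS/mountdata and reports the current
# ("Read previous values") and pending ("Permanent disk data") label
# sections, then exits.
tunefs.lustre --dryrun /dev/mapper/OST000F

# If "Permanent disk data" shows Target: lustre-OST000f but "Read previous
# values" still shows lustre-OSTffff, the corrected label has not actually
# been committed to disk yet.
```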

AFAIK, such a parameter change can only be propagated to all servers by unmounting, running
'tunefs.lustre --writeconf ...' on all targets, and then restarting.
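
Roughly, the procedure would be something like this (a sketch only; the device names are placeholders, and the order matters, so check the manual for your Lustre version before running anything):

```shell
# Sketch of a full writeconf, with placeholder device names.
# 0. Unmount all clients, then all OSTs, then the MDT/MGS.

# 1. Regenerate the configuration logs on every target:
tunefs.lustre --writeconf /dev/mapper/MDT0000    # placeholder MDT device
tunefs.lustre --writeconf /dev/mapper/OST0000    # ...repeat for every OST
tunefs.lustre --writeconf /dev/mapper/OST000F

# 2. Remount in order: MGS/MDT first, then the OSTs, then the clients.
#    The targets re-register and write fresh config logs on first mount.
```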



However, this is just a rough guess.
For your clients to lose Lustre like this, something ultra-weird must be going on on the MDS with that
ffff.

Hmm, did you try to deactivate OST000f, too?
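
I.e. deactivating it under its intended name as well. Something like the following, though the osc device name here is a guess, so check what the MDT actually calls it first:

```shell
# On the MDS: see which names the MDT has for the index-15 OST.
lctl dl | grep osc

# If a lustre-OST000f-osc-MDT0000 device exists, stop allocation to it too:
lctl --device lustre-OST000f-osc-MDT0000 deactivate
```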

Regards,
Thomas

Btw, our 2.5.3 MDS also shows deactivated OSTs as 'UP' in the 'lctl dl' listing.
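
As for emptying the OST afterwards, a rough sketch (this assumes the lfs_migrate script from your 2.5 client tools; try it on a small subtree first):

```shell
# List regular files with objects on the bad OST (index 0xffff = 65535)
# and migrate them to the remaining OSTs.  lfs_migrate copies each file
# and swaps it in place; -y suppresses the per-file confirmation.
lfs find /lustre --obd lustre-OSTffff_UUID -type f | lfs_migrate -y

# Once empty, the OST can be unmounted, removed from the config
# (writeconf), and reformatted with the correct index.
```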


On 05/23/2018 03:04 PM, Torsten Harenberg wrote:
> Dear all,
> 
> we have been running a Lustre 2.5.3 installation for a couple of years
> already. The devices come from a 3PAR SAN appliance.
> 
> Our users asked us to enlarge the available disk space, so we exported
> two new LUNs to the OST servers.
> 
> The file systems were created with:
> 
>  mkfs.lustre --fsname=lustre --ost --index 15 --backfstype=ldiskfs
> --failnode=<IP>@tcp --mgsnode=<IP>@tcp
> --mgsnode=<IP>@tcp --verbose /dev/mapper/OST000F
> 
> which went fine.
> 
> However, after mounting, the file system appears as
> 
> lustre-OSTffff_UUID   8585168804    35177704  8120481472   0%
> /lustre[OST:65535]
> 
> in lfs df.
> 
> And lfs df prints 65k+ lines with
> 
> OSTfff5             : Resource temporarily unavailable
> OSTfff6             : Resource temporarily unavailable
> OSTfff7             : Resource temporarily unavailable
> OSTfff8             : Resource temporarily unavailable
> OSTfff9             : Resource temporarily unavailable
> OSTfffa             : Resource temporarily unavailable
> OSTfffb             : Resource temporarily unavailable
> OSTfffc             : Resource temporarily unavailable
> OSTfffd             : Resource temporarily unavailable
> OSTfffe             : Resource temporarily unavailable
> 
> in between.
> 
> Searching for the root of this, we saw:
> 
> ------
> [root@lustre4 ~]# tunefs.lustre /dev/mapper/OST000F
> checking for existing Lustre data: found
> Reading CONFIGS/mountdata
> 
>    Read previous values:
> Target:     lustre-OSTffff
> Index:      15
> Lustre FS:  lustre
> Mount type: ldiskfs
> Flags:      0x2
>               (OST )
> Persistent mount opts: errors=remount-ro
> Parameters: failover.node=<IP>@tcp
> mgsnode=<IP>@tcp mgsnode=<IP>@tcp
> 
> 
>    Permanent disk data:
> Target:     lustre-OST000f
> Index:      15
> Lustre FS:  lustre
> Mount type: ldiskfs
> Flags:      0x2
>               (OST )
> Persistent mount opts: errors=remount-ro
> Parameters: failover.node=<IP>@tcp
> mgsnode=<IP>@tcp mgsnode=<IP>@tcp
> ------
> 
> 
> No idea where the
> 
>    Read previous values:
> Target:     lustre-OSTffff
> 
> comes from.
> 
> We then tried to free the OST immediately, which turned out to be
> more complicated than expected.
> 
> We tried to follow the manual and issued on the MDS:
> 
> [root@lustre1 ~]# lctl --device lustre-OSTffff-osc-MDT0000 deactivate
> 
> But the device is still "UP":
> 
> [root@lustre1 ~]# lctl dl
>   0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 24
>   1 UP mgs MGS MGS 427
>   2 UP mgc MGC132.195.124.201@tcp 17eb290e-d0a6-2047-3250-84f893ebc47a 5
>   3 UP mds MDS MDS_uuid 3
>   4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4
>   5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 455
>   6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4
>   7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4
>   8 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>   9 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  10 UP osp lustre-OST0002-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  11 UP osp lustre-OST0003-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  12 UP osp lustre-OST0004-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  13 UP osp lustre-OST0005-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  14 UP osp lustre-OST0006-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  15 UP osp lustre-OST0007-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  16 UP osp lustre-OST0008-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  17 UP osp lustre-OST0009-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  18 UP osp lustre-OST000a-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  19 UP osp lustre-OST000b-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  20 UP osp lustre-OST000c-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  21 UP osp lustre-OST000d-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  22 UP osp lustre-OST000e-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  23 UP osp lustre-OSTffff-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>  24 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5
> 
> We set it degraded on the OST:
> 
> [root@lustre4 ~]# lctl get_param obdfilter.*.degraded
> obdfilter.lustre-OST0008.degraded=0
> obdfilter.lustre-OST0009.degraded=0
> obdfilter.lustre-OST000a.degraded=0
> obdfilter.lustre-OST000b.degraded=0
> obdfilter.lustre-OST000c.degraded=0
> obdfilter.lustre-OST000d.degraded=0
> obdfilter.lustre-OST000e.degraded=0
> obdfilter.lustre-OSTffff.degraded=1
> 
> 
> But file system usage still grows:
> 
> [root@wnfg001 ~]# lfs df /lustre | grep ffff
> lustre-OSTffff_UUID   8585168804    35159988  8120496592   0%
> /lustre[OST:65535]
> [root@wnfg001 ~]# lfs df /lustre | grep ffff
> lustre-OSTffff_UUID   8585168804    35177704  8120481472   0%
> /lustre[OST:65535]
> 
> 
> We could stop the usage by setting it inactive on ALL (200+ in our case)
> clients with
> 
> lctl set_param osc.lustre-OSTffff-*.active=0
> 
> But then the file system becomes unusable for the users:
> 
> -bash-4.1# touch
> /lustre/gridsoft/arc/session/LeENDmtfhfsnsBfJnpimw0EmABFKDmABFKDmxSGKDmABFKDmhbxd6n/qq2
> touch: setting times of
> `/lustre/gridsoft/arc/session/LeENDmtfhfsnsBfJnpimw0EmABFKDmABFKDmxSGKDmABFKDmhbxd6n/qq2':
> Cannot send after transport endpoint shutdown
> 
> The same is true for "lctl --device XX deactivate".
> 
> 
> So we are looking for ways now to:
> 
> 1.) set the OST read-only while keeping the file system usable
> 2.) then migrate what's on this OSTffff (we started an lfs find already,
> but it takes very long)
> 3.) remove the OST and start from scratch.
> 
> And it would be really nice to understand where the OSTffff comes from and
> how one can avoid it.
> 
> 
> Any hint is really appreciated.
> 
> Best regards
> 
>  Torsten
> 
> 
> 

-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528

Geschäftsführung: Ursula Weyrich
Professor Dr. Paolo Giubellino
Jörg Blaurock

Vorsitzende des Aufsichtsrates: St Dr. Georg Schütte
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt


