[lustre-discuss] Full OST

Alastair Basden a.g.basden at durham.ac.uk
Thu Sep 16 01:45:35 PDT 2021


Hi all,

We mounted the OST as ext4, removed the files, and then remounted it as 
Lustre (and ran the lfsck scans).

All seemed fine, and the OST went back into production.

However, it has the same problem again: it is filling up.  Currently
lfs df reports it as 89% full, with 4.8TB used.

However, an lfs find --ost=... can only account for 268GB.
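
To put a number on the gap, here is a rough Python sketch.  It assumes both 
figures use the same unit convention (treated here as decimal GB, which may 
not match what lfs df actually reports); the function name is just for 
illustration.

```python
# Figures from this message, both expressed in GB.
# Assumption: lfs df and lfs find use the same unit convention.
USED_GB = 4800       # "4.8TB used" according to lfs df
ACCOUNTED_GB = 268   # total that lfs find --ost=... can account for


def unaccounted(used, accounted):
    """Return (missing amount, fraction of the used space that is missing)."""
    missing = used - accounted
    return missing, missing / used


missing_gb, fraction = unaccounted(USED_GB, ACCOUNTED_GB)
print(f"{missing_gb} GB unaccounted ({fraction:.1%} of used space)")
```

With these inputs it prints "4532 GB unaccounted (94.4% of used space)", so 
nearly all of the used space belongs to objects that no visible file 
references.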

So I again suspect that there are unlinked/deleted files, which aren't 
actually being deleted.
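
To see where that space sits, the find suggested in the quoted message below 
can be extended to total large files per owning uid.  A minimal Python 
sketch, assuming the OST is unmounted from Lustre and mounted read-only as 
ext4 at a hypothetical /mnt/ost; the function name and the 1 MiB threshold 
are illustrative only, not part of any Lustre tool.

```python
import os
import stat
from collections import defaultdict


def large_files_by_owner(root, min_bytes=1 << 20):
    """Total regular files larger than min_bytes under root, per owning uid.

    Roughly what `find <root> -type f -size +1M` lists, grouped by owner.
    Returns {uid: (file_count, total_apparent_bytes)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size > min_bytes:
                totals[st.st_uid][0] += 1
                totals[st.st_uid][1] += st.st_size
    return {uid: (count, size) for uid, (count, size) in totals.items()}


if __name__ == "__main__":
    # /mnt/ost is an assumed mount point for the ext4-mounted OST.
    for uid, (count, size) in sorted(large_files_by_owner("/mnt/ost").items()):
        print(f"uid {uid}: {count} files, {size / 1e9:.1f} GB")
```

This sums apparent file sizes (st_size), so sparse objects would be 
overcounted relative to the blocks they really occupy.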

Does anyone have any idea how to get it deleting files correctly?  All the 
other OSTs are behaving perfectly fine (including those served by the same 
OSS).

Cheers,
Alastair.



On Thu, 9 Sep 2021, Andreas Dilger wrote:

> On Sep 8, 2021, at 04:42, Alastair Basden <a.g.basden at durham.ac.uk> wrote:
>
>>> Next step would be to unmount OST004e, run a full e2fsck, and then check lost+found and/or a regular "find /mnt/ost -type f -size +1M" or similar to find where the files are.
>>
>> Thanks.  e2fsck returns clean (on its own, with -p and with -f).
>>
>> Now, the find command does return a large number of files belonging to usera - and of sufficient size to fill up the disk.
>>
>> e.g. /mnt/ost/O/0/d3/29379 has a size 2.3G.
>
> If you run 'll_decode_filter_fid /mnt/ost/O/0/d3/29379' or 'debugfs -c -R "stat O/0/d3/29379" /dev/<ostdev>' it will print the *parent* (MDT) FID suitable for "lfs fid2path" on a client.  This probably won't work, but worth a try anyway.
>
>> So it would seem that these files are getting deleted from the mds, but not from this OST.  Has this been seen before?  The other OSTs seem fine - stuff getting deleted as expected.
>
> Based on the very low object number, I would guess that these are old files and relate to some kind of issue seen in the past (e.g. MDT corruption where e2fsck cleared some inodes, or similar).  The "debugfs stat" command above will also print the object creation time along with the normal timestamps.
>
>> Is it safe to simply remove all these files, and then remount etc?  How can we ensure that new files will be deleted from the OST in the future?
>
> If they are not referenced by any in-use file (per fid2path) then yes.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud

