[lustre-discuss] Full OST

Wed Sep 22 01:41:20 PDT 2021

Hi all,

Some further developments, which we don't understand.

As files on this OST get written and deleted, it seems that they are 
removed from the MDS, but not actually deleted from the OST.  The OST then 
gradually fills up.

If we run the following on the MDS:
lctl set_param osc.snap8-OST004e-*.active=0
lctl set_param osc.snap8-OST004e-*.active=1

the OST then immediately frees the space held by all the removed files.

It then proceeds to fill up again as files are written and removed.

This is repeatable - we've been through this cycle twice now.
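
For anyone wanting to reproduce the cycle, here is a minimal sketch of the deactivate/reactivate steps above wrapped in a small script. The helper names (run, cycle_osc) and the DRY_RUN switch are made up for this example; only the lctl get_param and lctl set_param calls are real commands. It defaults to dry-run mode and only prints what it would do.

```shell
#!/bin/sh
# Sketch only: toggle the MDS-side OSC import for one OST, printing the
# "active" parameter before and after. Intended to be run on the MDS as root.
# DRY_RUN defaults to 1 (print commands only); set DRY_RUN=0 to execute.

# Hypothetical helper: either print or execute a command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: deactivate then reactivate the import matching
# the given parameter pattern, e.g. 'osc.snap8-OST004e-*.active'.
cycle_osc() {
    pattern=$1
    run lctl get_param "${pattern}"
    run lctl set_param "${pattern}=0"
    run lctl set_param "${pattern}=1"
    run lctl get_param "${pattern}"
}
```

With the pattern from this post it would be called as cycle_osc 'osc.snap8-OST004e-*.active' (quoted so the shell does not try to expand the wildcard).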

So the question is, why aren't the objects on the OST being deleted as 
expected?

Log messages from the MDS:
Sep 22 09:24:37 c8snapmds1 kernel: Lustre: setting import snap8-OST004e_UUID INACTIVE by administrator request
Sep 22 09:24:37 c8snapmds1 kernel: Lustre: Skipped 3 previous similar messages
Sep 22 09:24:39 c8snapmds1 kernel: Lustre: snap8-OST004e-osc-MDT0000: Connection to snap8-OST004e (at 172.18.185.50@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Sep 22 09:24:39 c8snapmds1 kernel: Lustre: Skipped 3 previous similar messages
Sep 22 09:24:40 c8snapmds1 kernel: LustreError: 4726:0:(import.c:1297:ptlrpc_connect_interpret()) snap8-OST004e_UUID went back in time (transno 98789932294 was previously committed, server now claims 12890210835)!  See https://bugzilla.lustre.org/show_bug.cgi?id=9646
Sep 22 09:24:40 c8snapmds1 kernel: LustreError: 167-0: snap8-OST004e-osc-MDT0000: This client was evicted by snap8-OST004e; in progress operations using this service will fail.
Sep 22 09:24:40 c8snapmds1 kernel: LustreError: Skipped 3 previous similar messages
Sep 22 09:24:40 c8snapmds1 kernel: Lustre: snap8-OST004e-osc-MDT0000: Connection restored to 172.18.185.80@o2ib (at 172.18.185.50@o2ib)
Sep 22 09:24:40 c8snapmds1 kernel: Lustre: Skipped 3 previous similar messages
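
To make the "went back in time" message easier to read, here is a rough sketch that pulls the two transaction numbers out of such a line and prints the difference. The function name transno_gap is made up for this example, and the pattern is matched only against the message wording shown above. Whether the gap has anything to do with the undeleted objects is not something this shows.

```shell
#!/bin/sh
# Sketch only: read kernel log lines on stdin and, for each
# "went back in time" message, print the previously committed transno,
# the transno the server now claims, and the difference between them.
transno_gap() {
    sed -n 's/.*transno \([0-9][0-9]*\) was previously committed, server now claims \([0-9][0-9]*\)).*/\1 \2/p' |
    while read -r committed claimed; do
        echo "committed=$committed claimed=$claimed gap=$((committed - claimed))"
    done
}
```

For the MDS line above this prints committed=98789932294 claimed=12890210835 gap=85899721459.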

And on the OSS:
Sep 22 09:24:39 c8snaposs10 kernel: Lustre: snap8-OST004e: Client snap8-MDT0000-mdtlov_UUID (at 172.18.185.40@o2ib) reconnecting
Sep 22 09:24:39 c8snaposs10 kernel: Lustre: Skipped 3 previous similar messages
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: snap8-OST004e: Connection restored to snap8-MDT0000-mdtlov_UUID (at 172.18.185.40@o2ib)
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: Skipped 3 previous similar messages
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: snap8-OST004e: deleting orphan objects from 0x0:41496 to 0x0:41537
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: snap8-OST004e: deleting orphan objects from 0x23c0000402:642 to 0x23c0000402:737
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: snap8-OST004e: deleting orphan objects from 0x23c0000401:642 to 0x23c0000401:737
Sep 22 09:24:40 c8snaposs10 kernel: Lustre: snap8-OST004e: deleting orphan objects from 0x23c0000400:1517 to 0x23c0000400:1537
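
As a rough way to see how large each orphan range in these messages is, the following sketch sums the ID ranges per sequence. The function name orphan_ranges is made up for this example. It assumes the object IDs after the colon are decimal and simply subtracts the first ID from the last; whether the endpoints are inclusive is not something the log line tells us.

```shell
#!/bin/sh
# Sketch only: read OSS kernel log lines on stdin and, for each
# "deleting orphan objects from SEQ:FIRST to SEQ:LAST" message, print
#   SEQ FIRST LAST (LAST - FIRST)
# followed by a final "total" line summing the differences.
orphan_ranges() {
    awk '/deleting orphan objects from/ {
        split($(NF-2), first, ":")   # e.g. 0x0:41496
        split($NF, last, ":")        # e.g. 0x0:41537
        span = last[2] - first[2]
        total += span
        print first[1], first[2], last[2], span
    }
    END { print "total", total + 0 }'
}
```

Fed the four OSS lines above, it reports differences of 41, 95, 95 and 20, for a total of 251.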

The same OSS also serves other OSTs, none of which are seeing this problem.

This is Lustre 2.12.6.

Thanks,
Alastair.

On Thu, 9 Sep 2021, Andreas Dilger wrote:

> On Sep 8, 2021, at 04:42, Alastair Basden <a.g.basden at durham.ac.uk> wrote:
>
>
> Next step would be to unmount OST004e, run a full e2fsck, and then check lost+found and/or a regular "find /mnt/ost -type f -size +1M" or similar to find where the files are.
>
>
> Thanks.  e2fsck returns clean (on its own, with -p and with -f).
>
> Now, the find command does return a large number of files belonging to usera - and of sufficient size to fill up the disk.
>
> e.g. /mnt/ost/O/0/d3/29379 has a size 2.3G.
>
> If you run 'll_decode_filter_fid /mnt/ost/O/0/d3/29379' or 'debugfs -c -R "stat O/0/d3/29379" /dev/<ostdev>' it will print the *parent* (MDT) FID suitable for "lfs fid2path" on a client.  This probably won't work, but worth a try anyway.
>
> So it would seem that these files are getting deleted from the mds, but not from this OST.  Has this been seen before?  The other OSTs seem fine - stuff getting deleted as expected.
>
> Based on the very low object number, I would guess that these are old files and relate to some kind of issue seen in the past (e.g. MDT corruption where e2fsck cleared some inodes, or similar).  The "debugfs stat" command above will also print the object creation time along with the normal timestamps.
>
> Is it safe to simply remove all these files, and then remount etc?  How can we ensure that new files will be deleted from the OST in the future?
>
> If they are not referenced by any in-use file (per fid2path) then yes.
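
A rough sketch of the check described above, for reference. The function name parent_fid is made up, and it assumes ll_decode_filter_fid prints a line containing "parent=[...]"; the exact output format may differ between Lustre versions, so treat the pattern as an assumption. The client mount point /mnt/lustre in the comment is also hypothetical.

```shell
#!/bin/sh
# Sketch only: extract the bracketed parent (MDT) FID from a line of the
# assumed form
#   <path>: objid=... seq=... parent=[0x...:0x...:0x...] ...
# as read on stdin.
parent_fid() {
    sed -n 's/.*parent=\(\[[^]]*\]\).*/\1/p'
}

# Intended use (not run here):
#   on the OSS, with the OST mounted as in the thread:
#     ll_decode_filter_fid /mnt/ost/O/0/d3/29379 | parent_fid
#   on a client, with the FID printed above:
#     lfs fid2path /mnt/lustre '<FID>'
```

If fid2path cannot resolve the FID to a path, that would be consistent with the object no longer being referenced by any file, which is the condition given above for removing it.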
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud