[lustre-discuss] Ongoing issues with quota

Daniel Szkola dszkola at fnal.gov
Wed Oct 4 07:14:28 PDT 2023


Hi Mark,

All nodes are using ZFS. OSTs, MDT, and MGT are all ZFS-based, so there's
really no way to fsck them. I could do a scrub, but that's not the same
thing. Is there a Lustre/ZFS equivalent of 'tune2fs -O [^]quota' for ZFS?
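
(For reference, the ldiskfs procedure I have in mind is rebuilding the quota
accounting by toggling the feature with the target offline, roughly:

  # ldiskfs targets only, run with the target unmounted; device path is just an example
  tune2fs -O ^quota /dev/mapper/mdt0
  tune2fs -O quota /dev/mapper/mdt0

but that obviously doesn't apply to a ZFS OSD.)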

I'm guessing that at some point, a large number of files was removed and
somehow quota accounting missed this.

There should be a simple way to reconcile what quota has recorded against
what is actually on disk, or to regenerate the accounting outright. I have
verified the actual usage two different ways.
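
For the record, the two checks were a robinhood accounting report and a plain
file count with lfs find, compared against what quota reports, roughly:

  # gid and mount point as in the output quoted below
  lfs quota -g somegroup /lustre1      # still reports over 3 million inodes used
  lfs find /lustre1 -G 9544 | wc -l    # comes back with 1796104, matching robinhood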

--
Dan

On Wed, 2023-10-04 at 15:01 +0100, Mark Dixon wrote:
> Hi Dan,
> 
> I think it gets corrected when you umount and fsck the OSTs themselves 
> (not lfsck). At least I recall seeing such messages when fsck'ing on 2.12.
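> 
> I mean the normal offline fsck of each ldiskfs target, something along the
> lines of (mount point and device path are only examples):
> 
>   umount /lustre/ost0
>   e2fsck -f /dev/mapper/ost0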
> 
> Best,
> 
> Mark
> 
> On Wed, 4 Oct 2023, Daniel Szkola via lustre-discuss wrote:
> 
> > 
> > No combination of lfsck runs has helped with this.
> > 
> > Again, robinhood shows 1796104 files for the group, and an 'lfs find -G gid'
> > found 1796104 files as well.
> > 
> > So why is the quota command showing over 3 million inodes used?
> > 
> > There must be a way to force it to recount, or to clear all stale quota
> > data and have it regenerated?
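> > 
> > Even a way to see which targets hold the stale counts would help. I assume
> > the per-target accounting can be dumped with something along the lines of
> > (parameter name from memory, so it may not be exact):
> > 
> >   lctl get_param osd-zfs.*.quota_slave.acct_group
> > 
> > and the per-gid inode counts compared against an actual count.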
> > 
> > Anyone?
> > 
> > —
> > Dan Szkola
> > FNAL
> > 
> > 
> > > On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss
> > > <lustre-discuss at lists.lustre.org> wrote:
> > > 
> > > We have a Lustre filesystem that we just upgraded to 2.15.3; however,
> > > this problem has been going on for some time.
> > > 
> > > The quota command shows this:
> > > 
> > > Disk quotas for grp somegroup (gid 9544):
> > >     Filesystem    used   quota   limit   grace    files    quota    limit    grace
> > >       /lustre1  13.38T     40T     45T       - 3136761*  2621440  3670016  expired
> > > 
> > > The group is not using nearly that many files. We have robinhood
> > > installed and it shows this:
> > > 
> > > Using config file '/etc/robinhood.d/lustre1.conf'.
> > >     group,     type,      count,     volume,   spc_used,   avg_size
> > > somegroup,   symlink,      59071,    5.12 MB,  103.16 MB,         91
> > > somegroup,       dir,     426619,    5.24 GB,    5.24 GB,   12.87 KB
> > > somegroup,      file,    1310414,   16.24 TB,   13.37 TB,   13.00 MB
> > > 
> > > Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), space used: 14704924899840 bytes (13.37 TB)
> > > 
> > > Any ideas what is wrong here?
> > > 
> > > —
> > > Dan Szkola
> > > FNAL
> > > _______________________________________________
> > > lustre-discuss mailing list
> > > lustre-discuss at lists.lustre.org
> > > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> > 
> > _______________________________________________
> > lustre-discuss mailing list
> > lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >  


