[lustre-discuss] Ongoing issues with quota

Daniel Szkola dszkola at fnal.gov
Mon Oct 9 07:30:31 PDT 2023


Is there really no way to force a recount of the files used by the quota? All indications are that we have accounts where files were removed but the removal is not reflected in the quota's used-file count. The space used seems correct, but the inode counts are far too high. There must be a way to clear these numbers and have a fresh count done.
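
As a cross-check, the figures quoted below can be totalled with a short shell sketch. The numbers are copied by hand from the quoted `lfs quota -I` output (OST indexes 0 through 7), the robinhood report, and the group quota summary; the variable names and the `sum` helper are illustrative only, and the samples were taken on different days, so small differences are expected.

```shell
#!/bin/sh
# Per-OST "files" values copied from the quoted `lfs quota -I` output (OST 0-7).
ost_files="132863 120643 190687 195034 185097 145347 169386 171552"

# Robinhood per-type entry counts for the group: symlink, dir, file.
rbh_counts="59071 426619 1310414"

# Inode usage reported by the group quota summary.
quota_files=3136761

# Hypothetical helper: add up all arguments with shell arithmetic.
sum() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# The lists are intentionally unquoted so they split into separate arguments.
ost_total=$(sum $ost_files)
rbh_total=$(sum $rbh_counts)
excess=$((quota_files - rbh_total))

echo "OST objects (sum over 8 OSTs): $ost_total"
echo "robinhood entries:             $rbh_total"
echo "quota files minus robinhood:   $excess"
```

The OST total comes to 1310609, close to the 1310414 regular files robinhood reports, while the quota summary is 1340657 entries above robinhood's total of 1796104. That would be consistent with the OST accounting being roughly right and the surplus sitting in the inode accounting elsewhere, though these numbers alone do not prove it.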

—
Dan Szkola
FNAL

> On Oct 4, 2023, at 11:37 AM, Daniel Szkola via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
> 
> Also, quotas on the OSTs don’t add up to anywhere near 3 million files either:
> 
> [root at lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 0 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                1394853459       0 1913344192       -  132863       0       0       -
> [root at lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 1 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                1411579601       0 1963246413       -  120643       0       0       -
> [root at lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 2 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                1416507527       0 1789950778       -  190687       0       0       -
> [root at lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 3 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                1636465724       0 1926578117       -  195034       0       0       -
> [root at lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 4 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                2202272244       0 3020159313       -  185097       0       0       -
> [root at lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 5 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                1324770165       0 1371244768       -  145347       0       0       -
> [root at lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 6 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                2892027349       0 3221225472       -  169386       0       0       -
> [root at lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 7 /lustre1
> Disk quotas for grp somegroup (gid 9544):
>     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>                2076201636       0 2474853207       -  171552       0       0       -
> 
> 
> Dan Szkola
> FNAL
> 
>> On Oct 4, 2023, at 8:45 AM, Daniel Szkola via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>> 
>> No combination of lfsck runs has helped with this.
>> 
>> Again, robinhood shows 1796104 files for the group, and an 'lfs find -G gid' found 1796104 files as well.
>> 
>> So why is the quota command showing over 3 million inodes used?
>> 
>> There must be a way to force it to recount or clear all stale quota data and have it regenerate it?
>> 
>> Anyone?
>> 
>> Dan Szkola
>> FNAL
>> 
>> 
>>> On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>>> 
>>> We have a Lustre filesystem that we just upgraded to 2.15.3; however, this problem has been going on for some time.
>>> 
>>> The quota command shows this:
>>> 
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem    used   quota   limit   grace   files   quota   limit   grace
>>>     /lustre1  13.38T     40T     45T       - 3136761* 2621440 3670016 expired
>>> 
>>> The group is not using nearly that many files. We have robinhood installed and it shows this:
>>> 
>>> Using config file '/etc/robinhood.d/lustre1.conf'.
>>>   group,     type,      count,     volume,   spc_used,   avg_size
>>> somegroup,   symlink,      59071,    5.12 MB,  103.16 MB,         91
>>> somegroup,       dir,     426619,    5.24 GB,    5.24 GB,   12.87 KB
>>> somegroup,      file,    1310414,   16.24 TB,   13.37 TB,   13.00 MB
>>> 
>>> Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), space used: 14704924899840 bytes (13.37 TB)
>>> 
>>> Any ideas what is wrong here?
>>> 
>>> Dan Szkola
>>> FNAL
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.lustre.org_listinfo.cgi_lustre-2Ddiscuss-2Dlustre.org&d=DwIGaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=e9DXjyTaQ786Tg7WH7oIVaQOA1YDRqyxHOUaYU2_LQw&m=Nk1MkSBTpT-KnrXzEvOOP5tZoVAKyHfPvB-o8_OhewuwHF6S0KelH_WPMLq8IRnR&s=JzAV0C2_CqaDUOG0wZr0mx5tiblBde6ZRUuIHZ2n9DI&e= 
>> 
> 


