[lustre-discuss] Ongoing issues with quota

Mark Dixon mark.c.dixon at durham.ac.uk
Wed Oct 4 07:40:43 PDT 2023


Hi Dan,

Ah, I see. Sorry, no idea - it's been a few years since I last used ZFS, 
and I've never used the Lustre ZFS backend.

Regards,

Mark

On Wed, 4 Oct 2023, Daniel Szkola wrote:

> [EXTERNAL EMAIL]
>
> Hi Mark,
>
> All nodes are using ZFS. OSTs, MDT, and MGT are all ZFS-based, so there's
> really no way to fsck them. I could do a scrub, but that's not the same
> thing. Is there a Lustre/ZFS equivalent of 'tune2fs -O [^]quota' for ZFS?
>
> I'm guessing that at some point, a large number of files was removed and
> somehow quota accounting missed this.
>
> There should be a simple way to reconcile or regenerate what quota has
> recorded vs what is actually on disk, which I have verified two different
> ways.
>
> --
> Dan
>
> On Wed, 2023-10-04 at 15:01 +0100, Mark Dixon wrote:
>> Hi Dan,
>>
>> I think it gets corrected when you umount and fsck the OST's themselves
>> (not lfsck). At least I recall seeing such messages when fsck'ing on 2.12.
>>
>> Best,
>>
>> Mark
>>
>> On Wed, 4 Oct 2023, Daniel Szkola via lustre-discuss wrote:
>>
>>> [EXTERNAL EMAIL]
>>>
>>> No combination of lfsck runs has helped with this.
>>>
>>> Again, robinhood shows 1796104 files for the group, an 'lfs find -G gid'
>>> found 1796104 files as well.
>>>
>>> So why is the quota command showing over 3 million inodes used?
>>>
>>> There must be a way to force it to recount or clear all stale quota data
>>> and have it regenerate it?
>>>
>>> Anyone?
>>>
>>>>>> Dan Szkola
>>> FNAL
>>>
>>>
>>>> On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss
>>>> <lustre-discuss at lists.lustre.org> wrote:
>>>>
>>>> We have a lustre filesystem that we just upgraded to 2.15.3, however
>>>> this problem has been going on for some time.
>>>>
>>>> The quota command shows this:
>>>>
>>>> Disk quotas for grp somegroup (gid 9544):
>>>>     Filesystem    used   quota   limit   grace   files   quota
>>>> limit   grace
>>>>       /lustre1  13.38T     40T     45T       - 3136761* 2621440
>>>> 3670016 expired
>>>>
>>>> The group is not using nearly that many files. We have robinhood
>>>> installed and it show this:
>>>>
>>>> Using config file '/etc/robinhood.d/lustre1.conf'.
>>>>     group,     type,      count,     volume,   spc_used,   avg_size
>>>> somegroup,   symlink,      59071,    5.12 MB,  103.16 MB,         91
>>>> somegroup,       dir,     426619,    5.24 GB,    5.24 GB,   12.87 KB
>>>> somegroup,      file,    1310414,   16.24 TB,   13.37 TB,   13.00 MB
>>>>
>>>> Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), space
>>>> used: 14704924899840 bytes (13.37 TB)
>>>>
>>>> Any ideas what is wrong here?
>>>>
>>>>>>>> Dan Szkola
>>>> FNAL
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.lustre.org_listinfo.cgi_lustre-2Ddiscuss-2Dlustre.org&d=DwIGaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=e9DXjyTaQ786Tg7WH7oIVaQOA1YDRqyxHOUaYU2_LQw&m=Nk1MkSBTpT-KnrXzEvOOP5tZoVAKyHfPvB-o8_OhewuwHF6S0KelH_WPMLq8IRnR&s=JzAV0C2_CqaDUOG0wZr0mx5tiblBde6ZRUuIHZ2n9DI&e=
>>>
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.lustre.org_listinfo.cgi_lustre-2Ddiscuss-2Dlustre.org&d=DwIDaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=e9DXjyTaQ786Tg7WH7oIVaQOA1YDRqyxHOUaYU2_LQw&m=BBVt50ufoqbL64MfSKVa87fK1B4Q0n91KVNJVmvb-9q9xOYwnzpZcOXWgUeM6fxQ&s=uTJ98MgxxcM61HIDJRBpfJpuLDt9Ug4ARh8P_Api3xQ&e=
>>>
>
>


More information about the lustre-discuss mailing list