[lustre-discuss] Ongoing issues with quota
Mark Dixon
mark.c.dixon at durham.ac.uk
Wed Oct 4 07:40:43 PDT 2023
Hi Dan,
Ah, I see. Sorry, no idea - it's been a few years since I last used ZFS,
and I've never used the Lustre ZFS backend.
Regards,
Mark
On Wed, 4 Oct 2023, Daniel Szkola wrote:
> [EXTERNAL EMAIL]
>
> Hi Mark,
>
> All nodes are using ZFS. OSTs, MDT, and MGT are all ZFS-based, so there's
> really no way to fsck them. I could do a scrub, but that's not the same
> thing. Is there a Lustre/ZFS equivalent of 'tune2fs -O [^]quota' for ZFS?
>
> I'm guessing that at some point, a large number of files was removed and
> somehow quota accounting missed this.
>
> There should be a simple way to reconcile or regenerate what quota has
> recorded vs what is actually on disk, which I have verified two different
> ways.
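> The kind of cross-check described above can be sketched as follows
> (mountpoint /lustre1 and gid 9544 are taken from this thread):

```shell
# Count entries belonging to the group via the namespace:
lfs find /lustre1 -G 9544 | wc -l
# Compare against quota accounting; -v gives a per-MDT/OST breakdown,
# which can show exactly which target's counters disagree:
lfs quota -g 9544 -v /lustre1
```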
>
> --
> Dan
>
> On Wed, 2023-10-04 at 15:01 +0100, Mark Dixon wrote:
>> Hi Dan,
>>
>> I think it gets corrected when you umount and fsck the OSTs themselves
>> (not lfsck). At least I recall seeing such messages when fsck'ing on 2.12.
>>
>> Best,
>>
>> Mark
>>
>> On Wed, 4 Oct 2023, Daniel Szkola via lustre-discuss wrote:
>>
>>> [EXTERNAL EMAIL]
>>>
>>> No combination of lfsck runs has helped with this.
>>>
>>> Again, robinhood shows 1796104 files for the group, and an 'lfs find -G gid'
>>> found 1796104 files as well.
>>>
>>> So why is the quota command showing over 3 million inodes used?
>>>
>>> There must be a way to force a recount, or to clear all stale quota data
>>> and have it regenerated.
>>>
>>> Anyone?
>>>
>>> —
>>> Dan Szkola
>>> FNAL
>>>
>>>
>>>> On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss
>>>> <lustre-discuss at lists.lustre.org> wrote:
>>>>
>>>> We have a Lustre filesystem that we just upgraded to 2.15.3; however,
>>>> this problem has been going on for some time.
>>>>
>>>> The quota command shows this:
>>>>
>>>> Disk quotas for grp somegroup (gid 9544):
>>>>      Filesystem    used  quota  limit  grace     files    quota    limit    grace
>>>>        /lustre1  13.38T    40T    45T      -  3136761*  2621440  3670016  expired
>>>>
>>>> The group is not using nearly that many files. We have robinhood
>>>> installed, and it shows this:
>>>>
>>>> Using config file '/etc/robinhood.d/lustre1.conf'.
>>>> group, type, count, volume, spc_used, avg_size
>>>> somegroup, symlink, 59071, 5.12 MB, 103.16 MB, 91
>>>> somegroup, dir, 426619, 5.24 GB, 5.24 GB, 12.87 KB
>>>> somegroup, file, 1310414, 16.24 TB, 13.37 TB, 13.00 MB
>>>>
>>>> Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), space
>>>> used: 14704924899840 bytes (13.37 TB)
>>>>
>>>> Any ideas what is wrong here?
>>>>
>>>> —
>>>> Dan Szkola
>>>> FNAL
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>
>
More information about the lustre-discuss mailing list