[lustre-discuss] Quota issue after OST removal

Daniel Szkola dszkola at fnal.gov
Wed Oct 26 13:37:27 PDT 2022


I did show an 'lfs quota -g somegroup' in the original post and yes, each OST
is at the limit that was originally allocated to it, especially after migrating
the files off of the two OSTs before removal.
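
(For completeness, the files were drained off the two OSTs with the usual
migration procedure, something like the following, run once per OST being
removed; OST0004 here stands in for each of the retired indices:

  lfs find /lustre1 --obd lustrefs-OST0004_UUID | lfs_migrate -y
)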

However, I think you may be misreading the issue here. The total quota is
27T, and the files on the remaining OSTs add up to just over 21T because
two OSTs have been removed permanently. The permanently removed OSTs should
not be part of the calculation anymore.

When the two OSTs were removed, shouldn't the quota be split among the
remaining OSTs with each OST given a bigger share of the overall quota? Is
there a way to force this? Will restarting the MDS cause this to happen?
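
(If there is a parameter worth checking to confirm the quota slaves on the
remaining OSTs have reintegrated, something like the following is what I had
in mind; the exact parameter name is from memory and may not be right:

  lctl get_param osd-*.*.quota_slave.info
)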

I just changed the soft/hard limits from 27T/30T to 37T/40T, and that does
allocate more space per OST, but putting them back to 27T/30T restores the
original values and the group is again shown as exceeding quota. Why is
setquota still using the removed OSTs? You can see in the listing below where
it is still looking for ost4 and ost5.

quotactl ost4 failed.
quotactl ost5 failed.
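
(For reference, the limit changes above were made with lfs setquota, roughly
along these lines, with the group name as before:

  # raise the limits; this hands each OST a larger allocation
  lfs setquota -g somegroup -b 37T -B 40T /lustre1
  # put the original limits back; the group is flagged as over quota again
  lfs setquota -g somegroup -b 27T -B 30T /lustre1
)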

--
Dan Szkola
FNAL

On Wed, 2022-10-26 at 21:00 +0200, Thomas Roth via lustre-discuss wrote:
> Hi Daniel,
> 
> isn't this expected: on your lustrefs-OST0001, usage seems to have hit
> the limit (perhaps if you do 'lfs quota -g somegroup...', it will show
> you by how many bytes).
> 
> If one part of the distributed quota is exceeded, Lustre should report
> that with the *, although the total across the file system is still
> below the limit.
> 
> 
> Obviously your 'somegroup' is at the quota limit on all visible OSTs, so
> my guess is that would be the same on the missing two OSTs.
> So, either have some data removed or increase the limit.
> 
> Best regards
> Thomas
> 
> On 26.10.22 16:52, Daniel Szkola via lustre-discuss wrote:
> > Hello all,
> > 
> > We recently removed an OSS/OST node that was spontaneously shutting down
> > so hardware testing could be performed. I have no idea how long it will
> > be out, so I followed the procedure for permanent removal.
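> > 
> > (The removal followed the manual's permanent-deactivation steps, roughly:
> > drain each OST with lfs_migrate and then mark it inactive on the MGS,
> > along the lines of
> > 
> >   lctl conf_param lustrefs-OST0004.osc.active=0
> >   lctl conf_param lustrefs-OST0005.osc.active=0
> > 
> > where the indices here correspond to the two OSTs that were pulled.)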
> > 
> > Since then space usage is being calculated correctly, but 'lfs quota'
> > will show groups as exceeding quota, despite being under both soft and
> > hard limits. A verbose listing shows that all OST limits are met and I
> > have no idea how to reset the limits now that the two OSTs on the
> > removed OSS node are not part of the equation.
> > 
> > Due to the heavy usage of the Lustre filesystem, no clients have been
> > unmounted and no MDS or OST nodes have been restarted. The underlying
> > filesystem is ZFS.
> > 
> > Looking for ideas on how to correct this.
> > 
> > Example:
> > 
> > # lfs quota -gh somegroup -v /lustre1
> > Disk quotas for grp somegroup (gid NNNN):
> >      Filesystem    used   quota   limit   grace   files   quota   limit   grace
> >        /lustre1  21.59T*    27T     30T 6d23h39m15s 2250592 2621440 3145728       -
> > lustrefs-MDT0000_UUID
> >                   1.961G       -  1.962G       - 2250592       - 2359296       -
> > lustrefs-OST0000_UUID
> >                   2.876T       -  2.876T       -       -       -       -       -
> > lustrefs-OST0001_UUID
> >                   2.611T*      -  2.611T       -       -       -       -       -
> > lustrefs-OST0002_UUID
> >                   4.794T       -  4.794T       -       -       -       -       -
> > lustrefs-OST0003_UUID
> >                   4.587T       -  4.587T       -       -       -       -       -
> > quotactl ost4 failed.
> > quotactl ost5 failed.
> > lustrefs-OST0006_UUID
> >                    3.21T       -   3.21T       -       -       -       -       -
> > lustrefs-OST0007_UUID
> >                   3.515T       -  3.515T       -       -       -       -       -
> > Total allocated inode limit: 2359296, total allocated block limit: 21.59T
> > Some errors happened when getting quota info. Some devices may be not
> > working or deactivated. The data in "[]" is inaccurate.
> > 
> > --
> > Dan Szkola
> > FNAL
> > _______________________________________________
> > lustre-discuss mailing list
> > lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >  
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>  


