[Lustre-devel] Moving forward on Quotas
Matthew Ahrens
Matthew.Ahrens at sun.com
Wed Jun 4 16:50:54 PDT 2008
Nikita Danilov wrote:
> Matthew Ahrens writes:
> > Nikita Danilov wrote:
> > > Jeff Bonwick writes:
> > > > I'd suggest working with Matt Ahrens on this.
> > >
> > > Hello,
> > >
> > > we were recently discussing what is needed from the DMU to implement quotas
> > > and other forms of space accounting. Our basic premise is that it is desirable
> > > to keep the DMU part of the quota support to a minimum, and to implement only
> > > the mechanism here, leaving policy to the upper layers.
> >
> > I agree with this premise. However, your proposed implementation (especially
> > the asynchronous update mechanism and associated pending file) seems
> > unnecessarily complicated.
> >
> > I would suggest that we simply update a "database" (eg. ZAP object or sparse
> > array) of userid -> space usage from syncing context when the space is
> > allocated/freed (ie, dsl_dataset_block_{born,kill}). I believe that the
> > problems this presents[*] will be more tractable than the method you outlined.
>
> Indeed, this solution is much simpler, and it was considered
> initially. I see the following drawbacks in it:
Agreed, those are possible drawbacks, depending on the implementation. For
example, if the DB object is stored in the user's objset (which is preferable
for other reasons) then I suspect that the two drawbacks you mention below
will be no worse than in your proposal.
--matt
> - a notion of a user identifier (or some opaque identifier) has to
> be introduced in the DMU interface. The DMU doesn't interpret these
> identifiers in any way, except for using them as keys in a space
> usage database. A set of these identifiers has to be passed to
> every DMU entry point that might result in space allocation (a
> set is needed because there are group quotas, and to keep the
> interface more or less generic).
>
> - an implementation of chown, chgrp, and distributed quota requires
> the DMU user to modify this database. Also, an interface to iterate
> over this database is most likely needed for things like
> distributed fsck, and user-level quota reporting tools. It seems
> that it would be quite difficult to encapsulate such a database
> within the DMU.
>
> >
> > --matt
> >
> > [*] eg, if the DB object is stored in the user's objset, then updating it in
> > syncing context may be problematic. if it is stored in the MOS, carrying it
>
> The proposal was to update the database in the context of the currently open
> transaction group. That is, when transaction group T has just committed,
> a commit call-back is invoked and the database is updated in the context
> of some transaction belonging to transaction group T + 2 (T + 1 being in
> sync). It is because of this that the pending file has to keep track of
> objects from the last two transaction groups.
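
The timing described in the paragraph above can be illustrated with a toy
model. This is only a sketch under stated assumptions, not DMU code: the
class DeferredAccounting, its methods, and identifiers such as "u:1000" are
hypothetical names invented for illustration. It assumes one syncing group
and one open group at a time, and it shows that changes recorded in group T
reach the database only when T commits, in the context of group T + 2, so
up to two groups have outstanding pending entries.

```python
class DeferredAccounting:
    """Toy model of the deferred update scheme; all names are hypothetical."""

    def __init__(self):
        self.syncing_txg = 1   # group currently being written out
        self.open_txg = 2      # group accepting new transactions
        self.pending = {}      # txg -> {identifier: byte delta}, the "pending file"
        self.database = {}     # identifier -> bytes used
        self.update_txg = {}   # committed txg -> txg that carried its update

    def record(self, ident, delta):
        # Space changes are noted against the currently open group.
        deltas = self.pending.setdefault(self.open_txg, {})
        deltas[ident] = deltas.get(ident, 0) + delta

    def commit_syncing(self):
        # Group T has just committed: T + 1 starts syncing, T + 2 opens.
        committed = self.syncing_txg
        self.syncing_txg += 1
        self.open_txg += 1
        # Commit call-back: fold T's pending deltas into the database.
        for ident, delta in self.pending.pop(committed, {}).items():
            self.database[ident] = self.database.get(ident, 0) + delta
        # The database update itself belongs to the now-open group, T + 2.
        self.update_txg[committed] = self.open_txg
        return committed


acct = DeferredAccounting()
acct.record("u:1000", 4096)      # recorded in group 2
acct.commit_syncing()            # group 1 commits; nothing to apply yet
acct.record("u:1000", 1024)      # recorded in group 3
outstanding = sorted(acct.pending)   # groups 2 and 3 are both pending
acct.commit_syncing()            # group 2 commits; its delta is applied in group 4
```

In this model the database always lags the on-disk state by the contents of
the pending entries, which is why those entries have to be kept for two
groups.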
>
> > along when doing snapshot operations will be painful (snapshot, clone, send,
> > recv, rollback, etc).
>
> Nikita.
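
For comparison, the simpler approach discussed in this thread (a database
mapping opaque identifiers to space usage, updated when blocks are allocated
or freed) could look roughly like the sketch below. It is a minimal
illustration under assumptions, not a DMU interface: the class
SpaceAccounting, its methods, and the "u:"/"g:" identifier strings are all
hypothetical, and a plain dictionary stands in for the ZAP object or sparse
array. A set of identifiers is passed on every charge because of group
quotas, and transfer and report correspond to the chown/chgrp and iteration
needs listed in the drawbacks.

```python
from collections import defaultdict


class SpaceAccounting:
    """Hypothetical stand-in for an identifier -> bytes-used database."""

    def __init__(self):
        self.usage = defaultdict(int)

    def block_born(self, ids, nbytes):
        # Charge every identifier in the set (e.g. one user id, one group id).
        for ident in ids:
            self.usage[ident] += nbytes

    def block_kill(self, ids, nbytes):
        # Credit the space back; drop entries that fall to zero.
        for ident in ids:
            self.usage[ident] -= nbytes
            if self.usage[ident] == 0:
                del self.usage[ident]

    def transfer(self, old_ids, new_ids, nbytes):
        # chown/chgrp: move already-allocated space between identifiers.
        self.block_kill(old_ids, nbytes)
        self.block_born(new_ids, nbytes)

    def report(self):
        # Iteration interface, as a quota reporting tool or fsck might need.
        return sorted(self.usage.items())


acct = SpaceAccounting()
acct.block_born({"u:1000", "g:100"}, 4096)
acct.block_born({"u:1001", "g:100"}, 8192)
acct.transfer({"u:1000"}, {"u:1001"}, 4096)   # chown of the first file
summary = acct.report()
```

The identifiers are never interpreted, only used as keys, which matches the
premise that policy stays in the upper layers.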