[Lustre-devel] Lustre dcache clean up and distributed locking

Yong Fan yong.fan at whamcloud.com
Sat Feb 11 07:58:42 PST 2012


Hi Tao,

The new design looks OK to me.
BTW, you have removed the section about "ll_lookup_lock" from the first
version of the document. So you will not change the related logic, right? I
think this part should be fixed in new kernels.

Cheers,
Nasf

On Thu, Feb 9, 2012 at 3:15 PM, <tao.peng at emc.com> wrote:

>
> > -----Original Message-----
> > From: Fan Yong [mailto:yong.fan at whamcloud.com]
> > Sent: Saturday, February 04, 2012 12:18 AM
> > To: Peng, Tao
> > Cc: adilger at whamcloud.com; bergwolf at gmail.com; Tang, Haiying;
> > faibish, sorin; Liu, Xuezhao; laisiyao at whamcloud.com;
> > green at whamcloud.com; eeb at whamcloud.com; bryon at whamcloud.com;
> > lustre-devel at lists.lustre.org
> > Subject: Re: Lustre dcache clean up and distributed locking
> >
> > On 2/3/12 3:37 PM, tao.peng at emc.com wrote:
> > >> -----Original Message-----
> > >> From: Fan Yong [mailto:yong.fan at whamcloud.com]
> > >> Sent: Tuesday, January 31, 2012 8:02 PM
> > >> To: Peng, Tao
> > >> Cc: Andreas Dilger; bergwolf at gmail.com; Tang, Haiying; faibish,
> > >> sorin; Liu, Xuezhao; laisiyao at whamcloud.com; green at whamcloud.com;
> > >> eeb at whamcloud.com; bryon at whamcloud.com; lustre-devel at lists.lustre.org
> > >> Subject: Re: Lustre dcache clean up and distributed locking
> > >>
> > >> On 1/31/12 7:27 AM, Andreas Dilger wrote:
> > >>> On 2012-01-30, at 12:46 AM, <tao.peng at emc.com> wrote:
> > >>>> Sorry for the late reply. I got your email earlier but didn't have
> > >>>> a chance to read the document due to very limited internet access.
> > >>>> Thank you very much for taking a detailed look at the document and
> > >>>> giving so many valuable comments.
> > >>>>
> > >>>> About adding a new inode bitlock, I must admit that I missed the
> > >>>> hardlink case. Although the lock mode can be changed to be the same
> > >>>> as the LOOKUP lock, I agree that an inode bitlock cannot
> > >>>> sufficiently protect every name/dentry separately for hardlinked
> > >>>> files. And I agree, as you suggested in the document, that adding a
> > >>>> name-entry-based lock can solve it.
> > >>>> IIUC, to add a name-entry-based lock, both client and server should
> > >>>> be modified, and the LDLM needs to be changed to manage the new
> > >>>> lock information. Like the new inode bitlock, the name-entry-based
> > >>>> lock will also introduce interoperability issues as a result of the
> > >>>> on-wire protocol change. And in a DNE environment, as Andreas
> > >>>> explained, the original plan is to get LOOKUP on both the master
> > >>>> MDT and the remote MDT. If we introduce a name-entry-based lock,
> > >>>> the DNE case may need some changes as well.
> > >>>> I will take a closer look at the approach. In the meantime, could
> > >>>> you please confirm whether it is the right way to go? The benefit
> > >>>> of the above complexity seems to be better distributed dentry
> > >>>> locking under the dcache framework.
> > >>>> What do others think? Andreas?
> > >>> Tao, before we go down the road of implementing such a change,
> > >>> there would have to be clear benefits to the performance and/or
> > >>> kernel integration before we could accept the interoperability
> > >>> issues and added code complexity.
> > >>> The benefits of having a per-name lock, as I see them, are:
> > >>> - can be cancelled separately from other name locks
> > >>> - can be changed independently from the inode permissions (LOOKUP)
> > >>> - potential ease of integration with the Linux dcache
> > >>>
> > >>> The drawbacks of introducing a per-name lock:
> > >>> - only useful for hard-linked files (uncommon case)
> > >>> - added code complexity for a new lock type
> > >>> - incompatibility with existing client/server
> > >>> - additional code to handle incompatibility transparently
> > >>> - lock enqueue may need an extra RPC for each name lock
> > >>>   (different resource)
> > >>> - lock cancellation may need an extra RPC for each resource
> > >>>
> > >>> The way I see it is that the DENTRY lock would need to be a
> > >>> different resource for each name on a file.  The only way this is
> > >>> going to be useful AFAICS is if the client enqueues each DENTRY
> > >>> lock on a separate resource, so that it can be granted and
> > >>> cancelled independently of the LOOKUP lock (which is on the FID and
> > >>> would be common for all names that reference this inode).  That
> > >>> would either need an extra RPC for every filename lookup (to
> > >>> enqueue both the DENTRY and LOOKUP locks) or a new LDLM_ENQUEUE RPC
> > >>> that can request and be granted two different locks (FID [seq, oid,
> > >>> ver, 0], and FID+name hash [seq, oid, ver, hash]) in a single RPC.
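The two-resource scheme described above could be sketched roughly as follows (an illustration only: ldlm_res_id and lu_fid mirror real Lustre types in simplified form, and the hash function is a placeholder, not the client's actual directory-entry hash):

```c
#include <assert.h>

typedef unsigned long long __u64;

/* Simplified stand-in for Lustre's ldlm_res_id (a resource name of
 * four 64-bit words); lu_fid is likewise reduced to its three parts. */
struct ldlm_res_id { __u64 name[4]; };
struct lu_fid { __u64 f_seq; __u64 f_oid; __u64 f_ver; };

/* Placeholder name hash -- the real client would reuse its directory
 * entry hash, not this toy function. */
static __u64 name_hash(const char *name)
{
        __u64 h = 0;
        while (*name != '\0')
                h = h * 31 + (unsigned char)*name++;
        return h;
}

/* LOOKUP resource [seq, oid, ver, 0]: shared by every name (link)
 * that references the inode. */
static struct ldlm_res_id fid_res(const struct lu_fid *fid)
{
        struct ldlm_res_id res = {{ fid->f_seq, fid->f_oid, fid->f_ver, 0 }};
        return res;
}

/* Per-name DENTRY resource [seq, oid, ver, hash(name)]: distinct per
 * link, so it can be granted and cancelled independently of LOOKUP. */
static struct ldlm_res_id dentry_res(const struct lu_fid *fid,
                                     const char *name)
{
        struct ldlm_res_id res = fid_res(fid);
        res.name[3] = name_hash(name);
        return res;
}
```

The point of the layout is that all names of a hard-linked file share name[0..2] (the FID), while name[3] distinguishes each link, so a per-name lock can be cancelled without touching the shared LOOKUP resource.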
> > >>> So, the main question is how much better the client dcache/VFS
> > >>> integration will be with the DCACHE lock compared to the existing
> > >>> mechanism with the LOOKUP (FID) lock?  Is it impossible (or very
> > >>> difficult) to use the existing mechanism for the new lockless
> > >>> dcache code?  Or is it a matter of this being a somewhat cleaner
> > >>> implementation, but not critical for moving forward?
> > >>> I'm aware that our dcache code has been too complex for some time,
> > >>> but I haven't looked recently to see how much of this is due to
> > >>> using/supporting old kernel APIs, and how much of it is due to the
> > >>> use of the LOOKUP lock on the FID protecting too much of the state
> > >>> on the client.
> > >>> Cheers, Andreas
> > >> Hi Tao,
> > >>
> > >> To introduce either a per-name lock or a DENTRY lock is mainly for
> > >> removing the Lustre-special flag "DCACHE_LUSTRE_INVALID" to simplify
> > >> Lustre code. But if such simplification causes more complex ldlm
> > >> logic, we have to consider whether it is worth removing
> > >> "DCACHE_LUSTRE_INVALID". It is not impossible to get
> > >> "DCACHE_LUSTRE_INVALID" accepted by the Linux kernel. On the other
> > >> hand, "DCACHE_LUSTRE_INVALID" can also work together with the
> > >> lockless dcache in new Linux kernels.
> > >>
> > >> So to push the project forward, we can divide it into two parts
> > >> (according to your document): Lustre dcache code cleanup, and some new
> > >> ldlm logic to remove "DCACHE_LUSTRE_INVALID". Before we reach an
> > >> agreement on the second part, we can start with the first part to
> > >> clean up the Lustre dcache code. What do you think?
> > >>
> > > Thank you, Andreas and Yong. I tend to agree with you that we should
> > > avoid the complexity if there are no obvious benefits. However, as we
> > > need to make sure the code is acceptable to the kernel community, I
> > > will try to consolidate a design for adding the new type of lock but
> > > just not implement it at this point. And if later we are asked (by
> > > kernel community developers) to do it, we just need to do the
> > > implementation.
> >
> > Hi Tao,
> >
> > We have another way to remove the Lustre-special dentry flag
> > "DCACHE_LUSTRE_INVALID". To mark a dentry in the dcache that is no
> > longer protected by a LOOKUP lock but is not unhashed yet, we can add
> > new flags in "ll_dentry_data" (pointed to by "dentry::d_fsdata"). Since
> > it is Lustre-private data, there is no need to worry about whether the
> > new flags will conflict with any standard VFS "dentry::d_flags", and it
> > may save some effort compared to implementing the complex DENTRY lock
> > in your original design.
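The suggestion above might look roughly like this (a minimal user-space sketch: ll_dentry_data and dentry::d_fsdata are the real names from the thread, but the LLD_LUSTRE_INVALID value and the helper functions are illustrative assumptions, not the actual patch):

```c
#include <assert.h>

/* User-space mock of the relevant piece of struct dentry. */
struct dentry { void *d_fsdata; };

/* Proposed Lustre-private flag: the dentry has lost its LOOKUP lock but
 * has not been unhashed yet.  The value 0x0001 is an arbitrary choice. */
#define LLD_LUSTRE_INVALID 0x0001

struct ll_dentry_data {
        unsigned int lld_flags;
        /* ... other Lustre-private per-dentry state ... */
};

/* ll_d2d() mirrors the usual d_fsdata accessor pattern. */
#define ll_d2d(de) ((struct ll_dentry_data *)((de)->d_fsdata))

/* Because the flag lives in Lustre-private data, it cannot collide with
 * any standard VFS bit in dentry::d_flags. */
static void ll_dentry_set_invalid(struct dentry *de)
{
        ll_d2d(de)->lld_flags |= LLD_LUSTRE_INVALID;
}

static int ll_dentry_is_invalid(const struct dentry *de)
{
        return ll_d2d(de)->lld_flags & LLD_LUSTRE_INVALID;
}
```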
> >
> Hi Fan Yong,
>
> Thanks for your great suggestion. I updated the design doc with the
> following changes; please help to review it. Thank you!
>
> Changelog:
> 1. Macro cleanups are removed. I prefer to do them separately while
> cleaning up other obsolete macros.
> 2. Removed the DENTRY LDLM lock proposal. Because the DCACHE_LUSTRE_INVALID
> flag can be hidden from the generic VFS layer, it is no longer necessary
> to introduce such a complex change.
> 3. Added a section describing how to add an LLD_LUSTRE_INVALID flag to the
> ll_dentry_data structure to replace the existing DCACHE_LUSTRE_INVALID
> flag.
>
> Cheers,
> Tao
> > Cheers,
> > Nasf
> >
> > > Please stay tuned.
> > >
> > > Best,
> > > Tao
> > >
> > >> Cheers,
> > >> --
> > >> Nasf
> > >>
> > >>>
> > >>>>> -----Original Message-----
> > >>>>> From: Fan Yong [mailto:yong.fan at whamcloud.com]
> > >>>>> Sent: Saturday, January 21, 2012 1:11 AM
> > >>>>> To: Peng, Tao; bergwolf at gmail.com
> > >>>>> Cc: Tang, Haiying; adilger at whamcloud.com; faibish, sorin; Liu,
> > >>>>> Xuezhao; laisiyao at whamcloud.com;
> > >>>>> green at whamcloud.com; eeb at whamcloud.com; Bryon Neitzel
> > >>>>> Subject: Re: Lustre dcache clean up and distributed locking
> > >>>>>
> > >>>>> Hi Tao,
> > >>>>>
> > >>>>> Excellent work. I just added some comments inside your document.
> > >>>>> Please check.
> > >>>>>
> > >>>>>
> > >>>>> Best Regards,
> > >>>>> Fan Yong
> > >>>>> Whamcloud, Inc.
> > >>>>>
> > >>>>> On 1/20/12 9:34 AM, haiying.tang at emc.com wrote:
> > >>>>>> Hi Andreas,
> > >>>>>>
> > >>>>>> Tao is on vacation now. It might be difficult for him to check
> > >>>>>> emails due to limited internet access at his hometown.
> > >>>>>> For urgent issues, you folks could send email to his gmail account
> > >>>>>> bergwolf at gmail.com.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Haiying
> > >>>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
> > >>>>>> Sent: January 20, 2012 1:39
> > >>>>>> To: Peng, Tao
> > >>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao;
> > >>>>>> laisiyao at whamcloud.com; yong.fan at whamcloud.com;
> green at whamcloud.com;
> > >>>>>> eeb at whamcloud.com
> > >>>>>> Subject: Re: Lustre dcache clean up and distributed locking
> > >>>>>>
> > >>>>>> On 2012-01-17, at 3:21 AM, <tao.peng at emc.com> wrote:
> > >>>>>>> Thanks Siyao and Oleg for answering my dcache revalidation
> > >>>>>>> question on the lustre mailing list. I updated the design to
> > >>>>>>> reflect it. Please see the attachment.
> > >>>>>> Tao,
> > >>>>>> Fan Yong is also taking a more detailed look at your document and
> > >>>>>> will hopefully have a chance to reply before the New Year holidays.
> > >>>>>>
> > >>>>>> Also, we are just working on landing the patches to add support
> > >>>>>> for Linux 2.6.38 for the Lustre client.  One of the patches
> > >>>>>> relates to the lockless dcache changes that were introduced in
> > >>>>>> that kernel.  If you are interested to review this patch, and
> > >>>>>> become more familiar with the Lustre development process, you
> > >>>>>> should visit http://review.whamcloud.com/1865 for the patch.
> > >>>>>>
> > >>>>>> You need to create an account in Gerrit using OpenID (Google,
> > >>>>>> mostly), and an account in our bug tracking system
> > >>>>>> (http://jira.whamcloud.com) if you haven't already.
> > >>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
> > >>>>>>>> Sent: Tuesday, January 17, 2012 4:16 PM
> > >>>>>>>> To: Peng, Tao
> > >>>>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; Lai Siyao; Fan
> > >>>>>>>> Yong; Oleg Drokin; Eric Barton
> > >>>>>>>> Subject: Re: Lustre dcache clean up and distributed locking
> > >>>>>>>>
> > >>>>>>>> On 2012-01-16, at 9:25 PM,<tao.peng at emc.com>   wrote:
> > >>>>>>>>> I finally started to work on Lustre dcache cleanup and
> > >>>>>>>>> locking. After reading Lustre, ocfs2 and VFS dcache related
> > >>>>>>>>> code, I came to a design for cleaning up the Lustre dcache
> > >>>>>>>>> code and doing distributed locking under the dcache. For
> > >>>>>>>>> distributed locking, the main idea is to add a new inode
> > >>>>>>>>> bitlock, the DENTRY lock, to protect just the valid dentry,
> > >>>>>>>>> instead of letting the LOOKUP lock serve multiple purposes,
> > >>>>>>>>> which makes the client unable to know whether a file is
> > >>>>>>>>> deleted when the server cancels the LOOKUP lock. Instead, the
> > >>>>>>>>> client holds a PR-mode DENTRY lock on all valid dentries, and
> > >>>>>>>>> when the server cancels it, the client knows the file is
> > >>>>>>>>> deleted.
> > >>>>>>>>> For details, please see the attachments (I attached both Word
> > >>>>>>>>> and PDF versions as I am not sure which is more convenient
> > >>>>>>>>> for you :). Please help to review and comment. Thank you!
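The intended client-side semantics of the proposed DENTRY bitlock could be sketched like this (a toy model only; all names and types here are hypothetical stand-ins, not Lustre code):

```c
#include <assert.h>

/* Toy model of the proposed protocol meaning, from the client's view:
 * every valid dentry holds a PR-mode DENTRY bitlock, so a server-side
 * cancel of that lock carries exactly one meaning -- the name entry was
 * removed. */

enum dentry_state { D_VALID, D_DELETED };

struct client_dentry {
        int holds_dentry_lock;          /* PR DENTRY lock held? */
        enum dentry_state state;
};

/* Blocking-AST-style callback invoked when the server cancels the
 * DENTRY lock.  With the multi-purpose LOOKUP lock, the client could
 * not tell deletion apart from other cancellation reasons; here it can. */
static void dentry_lock_cancelled(struct client_dentry *d)
{
        d->holds_dentry_lock = 0;
        d->state = D_DELETED;
}
```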
> > >>>>>>>>
> > >>>>>>>> Hi Tao,
> > >>>>>>>> I'm passing this on to the engineers that are most familiar
> > >>>>>>>> with the DLM and client dcache code in Lustre.  After a quick
> > >>>>>>>> first read through your document, your investigation of the
> > >>>>>>>> client code is very insightful and looks like it will be able
> > >>>>>>>> to remove some of the significant complexity that has grown
> > >>>>>>>> over time in the llite code.
> > >>>>>>>> Cheers, Andreas
> > >>>>>>>> --
> > >>>>>>>> Andreas Dilger                       Whamcloud, Inc.
> > >>>>>>>> Principal Engineer                   http://www.whamcloud.com/
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>> <dcache_distributed_locking-v2.docx>
> > >>>>>> Cheers, Andreas
> > >>>>>> --
> > >>>>>> Andreas Dilger                       Whamcloud, Inc.
> > >>>>>> Principal Engineer                   http://www.whamcloud.com/
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>> Cheers, Andreas
> > >>> --
> > >>> Andreas Dilger                       Whamcloud, Inc.
> > >>> Principal Engineer                   http://www.whamcloud.com/
> > >>>
> > >>>
> > >>>
> > >>>
> >
>
>