[Lustre-devel] Lustre dcache clean up and distributed locking

tao.peng at emc.com tao.peng at emc.com
Wed Feb 8 23:15:55 PST 2012


> -----Original Message-----
> From: Fan Yong [mailto:yong.fan at whamcloud.com]
> Sent: Saturday, February 04, 2012 12:18 AM
> To: Peng, Tao
> Cc: adilger at whamcloud.com; bergwolf at gmail.com; Tang, Haiying; faibish, sorin; Liu, Xuezhao;
> laisiyao at whamcloud.com; green at whamcloud.com; eeb at whamcloud.com; bryon at whamcloud.com; lustre-
> devel at lists.lustre.org
> Subject: Re: Lustre dcache clean up and distributed locking
> 
> On 2/3/12 3:37 PM, tao.peng at emc.com wrote:
> >> -----Original Message-----
> >> From: Fan Yong [mailto:yong.fan at whamcloud.com]
> >> Sent: Tuesday, January 31, 2012 8:02 PM
> >> To: Peng, Tao
> >> Cc: Andreas Dilger; bergwolf at gmail.com; Tang, Haiying; faibish, sorin; Liu, Xuezhao;
> >> laisiyao at whamcloud.com; green at whamcloud.com; eeb at whamcloud.com; bryon at whamcloud.com; lustre-
> >> devel at lists.lustre.org
> >> Subject: Re: Lustre dcache clean up and distributed locking
> >>
> >> On 1/31/12 7:27 AM, Andreas Dilger wrote:
> >>> On 2012-01-30, at 12:46 AM, <tao.peng at emc.com> wrote:
> >>>> Sorry for the late reply. I got your email earlier but didn't have a chance to read the document
> >>>> due to very limited internet access.
> >>>> Thank you very much for taking a detailed look at the document and giving so many valuable comments.
> >>>>
> >>>> About adding a new inode bitlock, I must admit that I missed the hardlink case. Although the lock
> >>>> mode can be changed to be the same as the LOOKUP lock, I agree that an inode bitlock cannot
> >>>> sufficiently protect every name/dentry separately for hardlinked files. And I agree that, as you
> >>>> suggested in the document, adding a name-entry-based lock can solve it.
> >>>> IIUC, to add a name-entry-based lock, both client and server should be modified, and LDLM needs to
> >>>> be changed to manage the new lock information. Like the new inode bitlock, the name-entry-based lock
> >>>> will also introduce interoperability issues as a result of the on-wire protocol change. And in a DNE
> >>>> environment, as Andreas explained, the original plan is to get LOOKUP on both the master MDT and the
> >>>> remote MDT. If we introduce a name-entry-based lock, the DNE case may need some changes as well.
> >>>> I will take a closer look at the approach. In the meantime, could you please confirm whether it is
> >>>> the right way to go? The benefit of the above complexity seems to be better distributed dentry
> >>>> locking under the dcache framework.
> >>>> What do others think? Andreas?
> >>> Tao, before we go down the road of implementing such a change, there would have to be clear benefits
> >>> to the performance and/or kernel integration before we could accept the interoperability issues and
> >>> added code complexity.
> >>> The benefits of having a per-name lock, as I see them, are:
> >>> - can be cancelled separately from other name locks
> >>> - can be changed independently from the inode permissions (LOOKUP)
> >>> - potential ease of integration with the Linux dcache
> >>>
> >>> The drawbacks of introducing a per-name lock:
> >>> - only useful for hard-linked files (uncommon case)
> >>> - added code complexity for a new lock type
> >>> - incompatibility with existing client/server
> >>> - additional code to handle incompatibility transparently
> >>> - lock enqueue may need extra RPC for each name lock (different resource)
> >>> - lock cancellation may need an extra RPC for each resource
> >>>
> >>> The way I see it is that the DENTRY lock would need to be a different resource for each name on a
> >>> file.  The only way this is going to be useful AFAICS is if the client enqueues each DENTRY lock on a
> >>> separate resource, so that it can be granted and cancelled independently of the LOOKUP lock (which is
> >>> on the FID and would be common for all names that reference this inode).  That would either need an
> >>> extra RPC for every filename lookup (to enqueue both the DENTRY and LOOKUP locks) or a new
> >>> LDLM_ENQUEUE RPC that can request and be granted two different locks (FID [seq, oid, ver, 0], and
> >>> FID+name hash [seq, oid, ver, hash]) in a single RPC.
> >>> So, the main question is how much better the client dcache/VFS integration will be with the DCACHE
> >>> lock compared to the existing mechanism with the LOOKUP (FID) lock?  Is it impossible (or very
> >>> difficult) to use the existing mechanism for the new lockless dcache code?  Or is it a matter of this
> >>> being a somewhat cleaner implementation, but not critical for moving forward?
> >>> I'm aware that our dcache code has been too complex for some time, but I haven't looked recently to
> >>> see how much of this is due to using/supporting old kernel APIs, and how much of it is due to the use
> >>> of the LOOKUP lock on the FID protecting too much of the state on the client.
> >>> Cheers, Andreas
> >> Hi Tao,
> >>
> >> Introducing either a per-name lock or a DENTRY lock is mainly about removing the Lustre-specific
> >> flag "DCACHE_LUSTRE_INVALID" to simplify the Lustre code. But if such a simplification causes more
> >> complex ldlm logic, we have to consider whether it is worth removing "DCACHE_LUSTRE_INVALID". It is
> >> not impossible to get "DCACHE_LUSTRE_INVALID" accepted by the linux kernel. On the other hand,
> >> "DCACHE_LUSTRE_INVALID" can also work together with the lockless dcache in newer linux kernels.
> >>
> >> So to push the project forward, we can divide it into two parts (according to your document): Lustre
> >> dcache code cleanup, and some new ldlm logic to remove "DCACHE_LUSTRE_INVALID". Before we reach an
> >> agreement on the second part, we can start with the first part to clean up the Lustre dcache code.
> >> What do you think?
> >>
> > Thank you, Andreas and Yong. I tend to agree with you that we should avoid the complexity if there
> > are no obvious benefits. However, as we need to make sure the code is acceptable to the kernel
> > community, I will try to consolidate a design for adding the new type of lock but just not implement
> > it at this point. And if we are later asked (by kernel community developers) to do it, we just need
> > to do the implementation.
> 
> Hi Tao,
> 
> We have another way to remove the Lustre-specific dentry flag
> "DCACHE_LUSTRE_INVALID". To mark a dentry that is still hashed in the
> dcache but is no longer protected by a LOOKUP lock, we can add a new
> flag in "ll_dentry_data" (pointed to by "dentry::d_fsdata"). Since it
> is Lustre private data, there is no need to worry about whether the
> new flag conflicts with any standard VFS "dentry::d_flags" bits, and
> it may save some of the effort of implementing the complex DENTRY
> lock in your original design.
> 
Hi Fan Yong,

Thanks for your great suggestion. I updated the design doc with the following changes; please help review. Thank you!

Changelog:
1. Removed the macro cleanups. I prefer to do them separately while cleaning up other obsolete macros.
2. Removed the DENTRY LDLM lock proposal. Because the DCACHE_LUSTRE_INVALID flag can be hidden from the generic VFS layer, it is no longer necessary to introduce such a complex change.
3. Added a section describing how to add an LLD_LUSTRE_INVALID flag to the ll_dentry_data structure to replace the existing DCACHE_LUSTRE_INVALID flag; a rough sketch of the idea is below.
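To make the intent concrete, here is a minimal sketch of what the new flag could look like. This is only an illustration of the direction described in the doc, not the final code: the lld_flags field and the helper names (ll_d2d, d_lustre_invalid, d_lustre_invalidate, d_lustre_revalidate) are placeholders, and the existing ll_dentry_data fields are elided.

/* Rough sketch only: lld_flags and the helpers below are illustrative
 * names, not the final implementation. */
struct ll_dentry_data {
        /* ... existing fields ... */
        unsigned int            lld_flags;
};

/* Dentry is still hashed, but no LOOKUP lock protects this name. */
#define LLD_LUSTRE_INVALID      0x0001

#define ll_d2d(de) ((struct ll_dentry_data *)((de)->d_fsdata))

/* Replaces checks of (dentry->d_flags & DCACHE_LUSTRE_INVALID). */
static inline int d_lustre_invalid(const struct dentry *dentry)
{
        return ll_d2d(dentry) == NULL ||
               (ll_d2d(dentry)->lld_flags & LLD_LUSTRE_INVALID);
}

/* Called when the LOOKUP lock covering this name is cancelled. */
static inline void d_lustre_invalidate(struct dentry *dentry)
{
        spin_lock(&dentry->d_lock);
        if (ll_d2d(dentry) != NULL)
                ll_d2d(dentry)->lld_flags |= LLD_LUSTRE_INVALID;
        spin_unlock(&dentry->d_lock);
}

/* Cleared again once a fresh LOOKUP lock is granted for the name. */
static inline void d_lustre_revalidate(struct dentry *dentry)
{
        spin_lock(&dentry->d_lock);
        if (ll_d2d(dentry) != NULL)
                ll_d2d(dentry)->lld_flags &= ~LLD_LUSTRE_INVALID;
        spin_unlock(&dentry->d_lock);
}

Since the flag lives entirely in Lustre private data, the generic VFS never sees it, which is what lets us drop the out-of-tree DCACHE_LUSTRE_INVALID bit from dentry::d_flags.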

Cheers,
Tao
> Cheers,
> Nasf
> 
> > Please stay tuned.
> >
> > Best,
> > Tao
> >
> >> Cheers,
> >> --
> >> Nasf
> >>
> >>>
> >>>>> -----Original Message-----
> >>>>> From: Fan Yong [mailto:yong.fan at whamcloud.com]
> >>>>> Sent: Saturday, January 21, 2012 1:11 AM
> >>>>> To: Peng, Tao; bergwolf at gmail.com
> >>>>> Cc: Tang, Haiying; adilger at whamcloud.com; faibish, sorin; Liu, Xuezhao; laisiyao at whamcloud.com;
> >>>>> green at whamcloud.com; eeb at whamcloud.com; Bryon Neitzel
> >>>>> Subject: Re: Lustre dcache clean up and distributed locking
> >>>>>
> >>>>> Hi Tao,
> >>>>>
> >>>>> Excellent work. I just added some comments inside your document. Please check.
> >>>>>
> >>>>>
> >>>>> Best Regards,
> >>>>> Fan Yong
> >>>>> Whamcloud, Inc.
> >>>>>
> >>>>> On 1/20/12 9:34 AM, haiying.tang at emc.com wrote:
> >>>>>> Hi Andreas,
> >>>>>>
> >>>>>> Tao is on vacation now. It might be difficult for him to check emails due to limited internet
> >>>>>> access at his hometown.
> >>>>>> For urgent issue, you folks could send email to his gmail account bergwolf at gmail.com.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Haiying
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
> >>>>>> Sent: January 20, 2012 1:39
> >>>>>> To: Peng, Tao
> >>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao;
> >>>>>> laisiyao at whamcloud.com; yong.fan at whamcloud.com; green at whamcloud.com;
> >>>>>> eeb at whamcloud.com
> >>>>>> Subject: Re: Lustre dcache clean up and distributed locking
> >>>>>>
> >>>>>> On 2012-01-17, at 3:21 AM, <tao.peng at emc.com> wrote:
> >>>>>>> Thanks Siyao and Oleg for answering my dcache revalidation question on the lustre mailing list.
> >>>>>>> I updated the design to reflect it.
> >>>>>>> Please see the attachment.
> >>>>>> Tao,
> >>>>>> Fan Yong is also taking a more detailed look at your document and will
> >>>>>> hopefully have a chance to reply before the New Year holidays.
> >>>>>>
> >>>>>> Also, we are just working on landing the patches to add support for
> >>>>>> Linux
> >>>>>> 2.6.38 for the Lustre client.  One of the patches relates to the
> >>>>>> lockless dcache changes that were introduced in that kernel.  If you
> >>>>>> are interested to review this patch, and become more familiar with the
> >>>>>> Lustre development process, you should visit http://review.whamcloud.com/1865 for the patch.
> >>>>>>
> >>>>>> You need to create an account in Gerrit using OpenID (Google, mostly),
> >>>>>> and an account in our bug tracking system (http://jira.whamcloud.com)
> >>>>>> if you haven't already.
> >>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
> >>>>>>>> Sent: Tuesday, January 17, 2012 4:16 PM
> >>>>>>>> To: Peng, Tao
> >>>>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; Lai Siyao; Fan
> >>>>>>>> Yong; Oleg Drokin; Eric Barton
> >>>>>>>> Subject: Re: Lustre dcache clean up and distributed locking
> >>>>>>>>
> >>>>>>>> On 2012-01-16, at 9:25 PM, <tao.peng at emc.com> wrote:
> >>>>>>>>> I finally started to work on Lustre dcache cleanup and locking.
> >>>>>>>>> After reading the Lustre, ocfs2 and VFS dcache related code, I came
> >>>>>>>>> to a design for cleaning up the Lustre dcache code and doing
> >>>>>>>>> distributed locking under the dcache. For distributed locking, the
> >>>>>>>>> main idea is to add a new inode bitlock, the DENTRY lock, to protect
> >>>>>>>>> just the valid dentry, instead of letting the LOOKUP lock serve
> >>>>>>>>> multiple purposes, which leaves the client unable to know whether a
> >>>>>>>>> file has been deleted when the server cancels the LOOKUP lock.
> >>>>>>>>> Instead, the client holds a PR mode DENTRY lock on all valid dentries,
> >>>>>>>>> and when the server cancels it, the client knows the file is deleted.
> >>>>>>>>> For details, please see the attachments (I attached both word and pdf
> >>>>>>>>> versions as I am not sure which is more convenient for you :) ).
> >>>>>>>>> Please help to review and comment. Thank you!
> >>>>>>>>
> >>>>>>>> Hi Tao,
> >>>>>>>> I'm passing this on to the engineers that are most familiar with the
> >>>>>>>> DLM and client dcache code in Lustre.  After a quick first read
> >>>>>>>> through your document, your investigation of the client code is very
> >>>>>>>> insightful and looks like it will be able to remove some of the
> >>>>>>>> significant complexity that has grown over time in the llite code.
> >>>>>>>> Cheers, Andreas
> >>>>>>>> --
> >>>>>>>> Andreas Dilger                       Whamcloud, Inc.
> >>>>>>>> Principal Engineer                   http://www.whamcloud.com/
> >>>>>>>>
> >>>>>>>>
> >>>>>>> <dcache_distributed_locking-v2.docx>
> >>>>>> Cheers, Andreas
> >>>>>> --
> >>>>>> Andreas Dilger                       Whamcloud, Inc.
> >>>>>> Principal Engineer                   http://www.whamcloud.com/
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>> Cheers, Andreas
> >>> --
> >>> Andreas Dilger                       Whamcloud, Inc.
> >>> Principal Engineer                   http://www.whamcloud.com/
> >>>
> >>>
> >>>
> >>>
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dcache_cleanup_v3.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 24186 bytes
Desc: dcache_cleanup_v3.docx
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20120209/639046f5/attachment.bin>

