[Lustre-devel] Lustre dcache clean up and distributed locking

Andreas Dilger adilger at whamcloud.com
Sun Jan 22 09:33:42 PST 2012

On 2012-01-22, at 9:11, Fan Yong <yong.fan at whamcloud.com> wrote:

> On 1/21/12 2:06 AM, Andreas Dilger wrote:
>> Fan Yong,
>> About DNE locking - in order to handle the split name/permission issue, the current plan is for the client to get the LOOKUP lock for the FID on both the master MDT and the remote MDT.
> Hi Andreas,
> Why two LOOKUP locks on two MDTs? Why not one LOOKUP lock on the MDT holding the inode and a (new) name lock on the MDT holding the name entry? Is there any special advantage? I think both methods need client-side support (for protocol changes).

The two approaches differ little in functional outcome, but the implementation difference is larger.

There are two reasons to prefer LOOKUP locks on both MDTs:
- we already have LOOKUP locks, so fewer code changes are needed
- having a single lock per FID is easier to manage than one lock per name

Considering that chmod/chgrp/chown/ACL operations are more common in practice than hard links, having a single LOOKUP lock is more efficient.
Even with hard links, if the client does a lookup on a link on the master, it can find the FID from the dirent and may have the inode data cached from the remote MDT. 
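
As a rough illustration of the "one lock per FID" versus "one lock per name" bookkeeping, here is a toy Python model (not Lustre code) that counts the locks a client would hold for a single inode under each scheme. The function names, the `(mdt, name)` tuples, and the counting rules are assumptions made for this sketch, and it assumes the client has looked up every link.

```python
def locks_per_fid(inode_mdt, links):
    """Count LOOKUP locks when the client holds one per FID on each MDT involved.

    links is a list of (mdt, name) pairs, one per hard link to the inode.
    """
    mdts = {mdt for mdt, _name in links}
    mdts.add(inode_mdt)
    return len(mdts)


def locks_per_name(inode_mdt, links):
    """Count locks with one LOOKUP lock on the inode MDT plus one name lock per link."""
    return 1 + len(links)


# One remote name and no extra hard links.
single = [("mdt0", "a")]
# Three hard links from the same master MDT to an inode on mdt1.
linked = [("mdt0", "a"), ("mdt0", "b"), ("mdt0", "c")]

for links in (single, linked):
    print(len(links), locks_per_fid("mdt1", links), locks_per_name("mdt1", links))
```

In this model the two schemes hold the same number of locks in the common case of a single name, and the per-name count only grows when hard links exist.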

I agree that name locks may have advantages in some cases, but I don't know if that is worth the extra complexity. 

Cheers, Andreas

>> If the link is removed from the master MDT, then the FID LOOKUP lock will be cancelled there, but if another hard link is removed from a different MDT then only the LOOKUP lock from that other MDT needs to be cancelled.
>> This will be similar to Tao's proposal for a separate lock bit for the name, in the DNE case where the name is remote from the inode. It still suffers from some inefficiency in case of multiple hard links from the same master MDT to a remote inode (canceling the FID LOOKUP lock when unlinking one name will force it to be refetched for the other links), but since hard links are so rare this should not significantly impact performance.
>> Cheers, Andreas
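
The cancellation behaviour described in the message above can be sketched with a toy client-side lock cache. This is an illustrative Python model under stated assumptions, not Lustre code: the `ClientLockCache` class, its method names, and the `(fid, mdt)` keying are all hypothetical.

```python
class ClientLockCache:
    """Toy client-side cache of LOOKUP locks, keyed by (fid, mdt)."""

    def __init__(self):
        self.lookup_locks = set()

    def lookup(self, fid, name_mdt, inode_mdt):
        """Take LOOKUP locks on both MDTs; return how many had to be fetched."""
        wanted = {(fid, name_mdt), (fid, inode_mdt)}
        fetched = wanted - self.lookup_locks
        self.lookup_locks |= wanted
        return len(fetched)

    def cancel_for_unlink(self, fid, name_mdt):
        """Unlinking a name cancels the FID LOOKUP lock on that name's MDT only."""
        self.lookup_locks.discard((fid, name_mdt))


cache = ClientLockCache()
first = cache.lookup("fid1", "mdt0", "mdt2")   # link "a" on mdt0, inode on mdt2
second = cache.lookup("fid1", "mdt0", "mdt2")  # link "b" on mdt0 reuses both locks
third = cache.lookup("fid1", "mdt1", "mdt2")   # link "c" on mdt1 needs one more lock

cache.cancel_for_unlink("fid1", "mdt1")  # unlink "c": the mdt0 lock is untouched
cache.cancel_for_unlink("fid1", "mdt0")  # unlink "a": the lock "b" relied on is gone
refetch = cache.lookup("fid1", "mdt0", "mdt2")  # "b" must refetch the mdt0 lock

print(first, second, third, refetch)
```

The last lookup shows the inefficiency mentioned above: unlinking one of two names on the same master MDT forces the lock to be refetched for the other name.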
>> On 2012-01-20, at 10:10, Fan Yong<yong.fan at whamcloud.com>  wrote:
>>> Excellent work. I just added some comments inside your document. Please
>>> check.
>>> Best Regards,
>>> Fan Yong
>>> Whamcloud, Inc.
>>> On 1/20/12 9:34 AM, haiying.tang at emc.com wrote:
>>>> Hi Andreas,
>>>> Tao is on vacation now. It might be difficult for him to check email due to limited internet access in his hometown.
>>>> For urgent issues, you could send email to his gmail account bergwolf at gmail.com.
>>>> Thanks,
>>>> Haiying
>>>> -----Original Message-----
>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
>>>> Sent: January 20, 2012 1:39
>>>> To: Peng, Tao
>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; laisiyao at whamcloud.com; yong.fan at whamcloud.com; green at whamcloud.com; eeb at whamcloud.com
>>>> Subject: Re: Lustre dcache clean up and distributed locking
>>>> On 2012-01-17, at 3:21 AM, <tao.peng at emc.com> wrote:
>>>>> Thanks Siyao and Oleg for answering my dcache revalidation question on lustre mailing list. I updated the design to reflect it.
>>>>> Please see attachment.
>>>> Tao,
>>>> Fan Yong is also taking a more detailed look at your document and will
>>>> hopefully have a chance to reply before the New Year holidays.
>>>> Also, we are just working on landing the patches to add support for Linux
>>>> 2.6.38 for the Lustre client.  One of the patches relates to the lockless
>>>> dcache changes that were introduced in that kernel.  If you are interested
>>>> in reviewing this patch and becoming more familiar with the Lustre development
>>>> process, you should visit http://review.whamcloud.com/1865 for the patch.
>>>> You need to create an account in Gerrit using OpenID (Google, mostly), and
>>>> an account in our bug tracking system (http://jira.whamcloud.com) if you
>>>> haven't already.
>>>>>> -----Original Message-----
>>>>>> From: Andreas Dilger [mailto:adilger at whamcloud.com]
>>>>>> Sent: Tuesday, January 17, 2012 4:16 PM
>>>>>> To: Peng, Tao
>>>>>> Cc: faibish, sorin; Tang, Haiying; Liu, Xuezhao; Lai Siyao; Fan Yong; Oleg Drokin; Eric Barton
>>>>>> Subject: Re: Lustre dcache clean up and distributed locking
>>>>>> On 2012-01-16, at 9:25 PM,<tao.peng at emc.com>  wrote:
>>>>>>> I finally started to work on Lustre dcache cleanup and locking. After reading the Lustre, ocfs2 and VFS
>>>>>> dcache-related code, I arrived at a design for cleaning up the Lustre dcache code and doing distributed
>>>>>> locking under the dcache. For distributed locking, the main idea is to add a new inode bit lock, DENTRY,
>>>>>> that only protects valid dentries, instead of letting the LOOKUP lock serve multiple purposes, which leaves
>>>>>> the client unable to tell whether a file has been deleted when the server cancels the LOOKUP lock. Instead, the
>>>>>> client holds a PR mode DENTRY lock on every valid dentry, and when the server cancels it, the client knows the
>>>>>> file has been deleted.
>>>>>>> For details, please see the attachments (I attached both Word and PDF versions as I am not sure
>>>>>> which is more convenient for you :). Please help to review and comment. Thank you!
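
The DENTRY bit proposal in the message above can be illustrated with a small Python sketch. It is a toy model, not Lustre code: the `InodeBits` enum, its bit values, and the `dentry_state_after_cancel` helper are hypothetical, and the "valid" result for a LOOKUP-only cancellation under the proposal is an inference from the description rather than something the design document is known to state.

```python
from enum import IntFlag


class InodeBits(IntFlag):
    # Illustrative values only; not taken from the Lustre headers.
    LOOKUP = 0x1
    DENTRY = 0x2  # hypothetical new bit from the proposal


def dentry_state_after_cancel(cancelled, has_dentry_bit):
    """Return what the client can conclude about a dentry after a lock cancel."""
    if has_dentry_bit:
        # Proposed scheme: only a DENTRY cancel means the file was deleted.
        if cancelled & InodeBits.DENTRY:
            return "deleted"
        return "valid"
    # Existing scheme: LOOKUP serves several purposes, so a cancel is ambiguous.
    if cancelled & InodeBits.LOOKUP:
        return "unknown"
    return "valid"


print(dentry_state_after_cancel(InodeBits.LOOKUP, has_dentry_bit=False))
print(dentry_state_after_cancel(InodeBits.LOOKUP, has_dentry_bit=True))
print(dentry_state_after_cancel(InodeBits.DENTRY, has_dentry_bit=True))
```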
>>>>>> Hi Tao,
>>>>>> I'm passing this on to the engineers that are most familiar with the DLM and client dcache code in
>>>>>> Lustre.  After a quick first read through your document, your investigation of the client code is very
>>>>>> insightful and looks like it will be able to remove some of the significant complexity that has grown
>>>>>> over time in the llite code.
>>>>>> Cheers, Andreas
>>>>>> --
>>>>>> Andreas Dilger                       Whamcloud, Inc.
>>>>>> Principal Engineer                   http://www.whamcloud.com/
>>>>> <dcache_distributed_locking-v2.docx>
>>>> Cheers, Andreas
>>>> --
>>>> Andreas Dilger                       Whamcloud, Inc.
>>>> Principal Engineer                   http://www.whamcloud.com/
>>> <dcache_distributed_locking-v2_comment.docx>
