[Lustre-discuss] Client directory entry caching

Daire Byrne daire.byrne at gmail.com
Tue Aug 3 09:49:10 PDT 2010


Oleg,

On Tue, Aug 3, 2010 at 5:21 AM, Oleg Drokin <oleg.drokin at oracle.com> wrote:
>> So even with the metadata going over NFS the opencache in the client
>> seems to make quite a difference (I'm not sure how much the NFS client
>> caches though). As expected I see no mdt activity for the NFS export
>> once cached. I think it would be really nice to be able to enable the
>> opencache on any lustre client. A couple of potential workloads that I
>
> A simple workaround for you to enable opencache on a specific client would
> be to add cr_flags |= MDS_OPEN_LOCK; in mdc/mdc_lib.c:mds_pack_open_flags()

Yea that works - cheers. FYI some comparisons with a simple find on a
remote client (~33,000 files):

  find /mnt/lustre (not cached) = 41 secs
  find /mnt/lustre (cached) = 19 secs
  find /mnt/lustre (opencache) = 3 secs
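
For the record, the change as I applied it amounts to something like
this (just a sketch against the 1.8 mdc code -- the exact body of
mds_pack_open_flags() may differ):

  /* lustre/mdc/mdc_lib.c (1.8) -- sketch, not verbatim source.
   * mds_pack_open_flags() builds the MDS_OPEN_* bits carried in the
   * open RPC; unconditionally OR-ing in MDS_OPEN_LOCK asks the MDS
   * for an open lock on every open, which is what lets the client
   * cache opens. */
  static __u32 mds_pack_open_flags(__u32 flags)
  {
          __u32 cr_flags = 0;

          /* ... existing translation of O_CREAT, O_EXCL, etc. ... */

          cr_flags |= MDS_OPEN_LOCK;      /* the suggested one-liner */
          return cr_flags;
  }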

The "ls -lR" case is still having to query the MDS a lot (for
getxattr) which becomes quite noticeable in the WAN case. Apparently
the 1.8.4 client already addresses this (#15587?). I might try that
patch too...
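
For the curious, those per-entry MDS round trips come from ACL/security
xattr probes of roughly this form (illustrative only -- the exact
attribute names depend on how ls was built):

  /* Illustrative only: the kind of per-file ACL xattr probe that
   * "ls -l" issues; on Lustre each one is an MDS round trip. */
  #include <stdio.h>
  #include <sys/types.h>
  #include <sys/xattr.h>

  int main(int argc, char **argv)
  {
          char buf[1024];
          ssize_t n;

          if (argc < 2)
                  return 1;
          /* GNU ls checks for POSIX ACLs with a lookup like this;
           * even a negative answer (ENODATA) costs the round trip */
          n = lgetxattr(argv[1], "system.posix_acl_access",
                        buf, sizeof(buf));
          printf("%s: system.posix_acl_access -> %zd\n", argv[1], n);
          return 0;
  }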

> I guess we really need to have an option for this, but I am not sure
> if we want it on the client, server, or both.

Doing it client side with the minor modification you suggest is
probably enough for our purposes for the time being. Thanks.

>> can think of that would benefit are WAN clients and clients that need
>> to do mainly metadata (e.g. scanning the filesystem, rsync --link-dest
>> hardlink snapshot backups). For the WAN case I'd be quite interested
>
> Open is a very narrow metadata case, so if you do metadata but no opens you
> would get zero benefit from the open cache.

I suppose the recursive scan case is a fairly low-frequency operation,
but it is also one where Lustre has always suffered noticeably worse
performance than something simpler like NFS. Slightly off topic (and
I've kinda asked this before), but is there a good reason why link()
speeds in Lustre are so slow compared to something like NFS?
A quick comparison of doing a "cp -al" from a remote Lustre client and
an NFS client (to a fast NFS server):

  cp -al /mnt/lustre/blah /mnt/lustre/blah2 = ~362 files/sec
  cp -al /mnt/nfs/blah /mnt/nfs/blah2 = ~1863 files/sec

Is it just the extra depth of the Lustre stack/code path? Is there
anything we could do to speed this up if we know that no other client
will touch these dirs while we hardlink them?
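
To put a number on the raw link() rate without cp in the way, something
like this trivial loop would do (a sketch -- the paths and file count
are made up, and it assumes the source files already exist):

  /* Minimal link() microbenchmark: times n hardlink creations so the
   * raw metadata rate can be compared between mounts without the
   * rest of cp's overhead. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/time.h>

  int main(void)
  {
          const int n = 1000;
          char src[256], dst[256];
          struct timeval t0, t1;
          double secs;
          int i;

          gettimeofday(&t0, NULL);
          for (i = 0; i < n; i++) {
                  /* assumes blah/file.0 .. file.999 already exist and
                   * blah2/ is an empty directory on the same mount */
                  snprintf(src, sizeof(src), "/mnt/lustre/blah/file.%d", i);
                  snprintf(dst, sizeof(dst), "/mnt/lustre/blah2/file.%d", i);
                  if (link(src, dst) != 0) {
                          perror("link");
                          exit(1);
                  }
          }
          gettimeofday(&t1, NULL);

          secs = (t1.tv_sec - t0.tv_sec) +
                 (t1.tv_usec - t0.tv_usec) / 1e6;
          printf("%d links in %.2fs (%.0f links/sec)\n", n, secs, n / secs);
          return 0;
  }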

> Also, getting this extra lock puts some extra CPU load on the MDS, but if we
> go this far, we can then somewhat simplify rep-ack and hold it for a much
> shorter time in a lot of cases, which would greatly help WAN workloads that
> happen to create files in the same dir from many nodes, for example (see
> bug 20373, first patch). Just be aware that testing with more than 16000
> clients at ORNL clearly shows degradation at LAN latencies.

Understood. I think we are a long way off hitting those kinds of
limits. The WAN case is interesting because the interactive speed of
browsing the filesystem is usually the most noticeable (and annoying)
artefact of being many miles away from the server. Once you start
accessing the files you want, you are reasonably happy to be limited
by your connection's overall bandwidth.

Thanks for the feedback,

Daire


