[Lustre-discuss] Poor metadata operation performance

Andreas Dilger adilger at whamcloud.com
Fri May 20 09:37:50 PDT 2011


Ken, the OSTs need to track the ownership of objects for quota.  The more stripes a file has, the more RPCs need to be sent, which is why we don't recommend wide striping unless there is a reason for it (bandwidth, size, etc).
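
As a rough illustration of how that scales (a back-of-the-envelope sketch only, not taken from the Lustre code): assume one ownership-update RPC per stripe object, so the total is files times stripe count. The file count is the approximate figure from Ken's message; the OST count of 16 is purely hypothetical, since the thread does not say how many OSTs the filesystem has.

```python
def ost_setattr_rpcs(num_files: int, stripe_count: int) -> int:
    """Estimate MDS-to-OST ownership-update RPCs for a chown/chgrp pass.

    Assumption (not confirmed by the thread): one RPC per stripe object,
    so the total is simply files * stripes.
    """
    if num_files < 0 or stripe_count < 1:
        raise ValueError("need num_files >= 0 and stripe_count >= 1")
    return num_files * stripe_count


NUM_FILES = 50_000            # approximate tree size from Ken's message
HYPOTHETICAL_OST_COUNT = 16   # assumption: the real OST count is not given

for label, stripes in (
    ("single stripe", 1),
    ("striped across all OSTs", HYPOTHETICAL_OST_COUNT),
):
    total = ost_setattr_rpcs(NUM_FILES, stripes)
    print(f"{label:>24}: {total:,} RPCs")
```

Under these assumptions the wide-striped tree generates 16 times as many RPCs as the single-stripe tree for the same chgrp run, which is the same order of magnitude as the slowdown Ken reports.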

Cheers, Andreas

On 2011-05-20, at 7:49 AM, Ken Hornstein <kenh at cmf.nrl.navy.mil> wrote:

> So I guess there are some things I _still_ don't understand about Lustre
> metadata handling.  Specifically, what metadata gets stored on OSTs and
> why.
> 
> What brings this all up is that a) we have users who have lots of files
> and b) we are currently going through a reorganization that requires
> changing the groups on lots of these files (this is all running Lustre
> 1.8.4; we're due for an upgrade in the medium future).
> 
> I figured okay, this wouldn't be so bad, since those are all metadata
> server operations.  But I started running some tests, and I found out
> that chown() system calls perform poorly.
> 
> Because I was doing some previous metadata performance analysis, I took
> a source code tree which consists of approximately 50,000 files and put
> two copies in one of our Lustre filesystems: one with the default striping
> (across all OSTs) and one where all files have no striping at all.  The
> performance difference between these two trees for stat() calls is large, as you
> can imagine, but the disparity between the chown() calls is even larger.
> You can run chgrp on all of the files in the unstriped copy in about
> 3-5 seconds, but the striped copy takes more than 50 seconds.
> 
> I did some more digging as to why this is.  At first I thought this might
> be an issue on the client, but there is code in there that skips
> over talking to the OSTs for certain types of metadata updates, and turning
> on debugging on the client verifies that no setattr RPCs are being sent
> to the OSSes.  Looking more closely at the RPC traces reveals that the issue
> is on the metadata server; the setattr RPCs simply take longer when the
> files are striped.
> 
> I've looked at the metadata server code for a bit, and I've verified
> that the metadata server does send setattr RPCs to the OSSes, but I see
> that it's done asynchronously; it shouldn't be waiting for the
> replies.  So I'm stumped as to why this is happening.  I also realize
> that I'm still puzzled as to what metadata is stored on the OSTs; it seems
> like the client prefers the metadata from the MDS (except of course for
> size), but a fair amount of metadata is still stored on the OSSes.  Can
> anyone shed some light on this?
> 
> --Ken
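
The comparison Ken describes in the quoted message (timing a group change over every file in a tree) could be reproduced with something like the sketch below. It is a minimal illustration, not the test he actually ran: the function name `time_chgrp` is made up here, and it only touches regular directory entries found by `os.walk`, leaving the owner unchanged by passing -1 as the uid.

```python
import os
import time
from typing import Tuple


def time_chgrp(root: str, gid: int) -> Tuple[int, float]:
    """Change the group of every file under root; return (file count, seconds).

    Hypothetical helper for illustration. Passing -1 as the uid leaves the
    owner untouched, so each call is a group-only setattr like chgrp.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # follow_symlinks=False changes the link itself, not its target
            os.chown(path, -1, gid, follow_symlinks=False)
            count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed
```

Running it once against the wide-striped copy and once against the single-stripe copy, with the same target gid, gives the two elapsed times to compare. The caller needs to own the files and be a member of the target group (or be root).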


