[Lustre-discuss] Slow Directory Listing

Jeremy Filizetti jeremy.filizetti at gmail.com
Tue Sep 6 19:20:14 PDT 2011


The statahead_max tunable is actually at llite.*.statahead_max, not under mdc.
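
For anyone who wants to check the current value from a script, here is a minimal Python sketch. It assumes the tunable is exposed as a file under /proc/fs/lustre/llite/<instance>/statahead_max, which is where 1.8-era clients put it as far as I know; the helper name read_statahead_max and the proc_root parameter are made up for this example. `lctl get_param llite.*.statahead_max` remains the normal way to read it.

```python
import glob
import os


def read_statahead_max(proc_root="/proc/fs/lustre"):
    """Return {llite instance name: statahead_max} for each client mount found.

    proc_root is an assumption about where the tunables live; on a machine
    without Lustre mounted this simply returns an empty dict.
    """
    values = {}
    pattern = os.path.join(proc_root, "llite", "*", "statahead_max")
    for path in sorted(glob.glob(pattern)):
        # The directory name identifies the mounted filesystem instance.
        instance = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            values[instance] = int(f.read().strip())
    return values


if __name__ == "__main__":
    for name, value in read_statahead_max().items():
        print(f"{name}: statahead_max={value}")
```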

Jeremy

On Tue, Sep 6, 2011 at 10:13 PM, Jeremy Filizetti <
jeremy.filizetti at gmail.com> wrote:

> You'll see a small benefit from mdc.*.statahead_max which will grab most of
> the information for the stat from the MDT, but you still pay a penalty
> because the glimpse for each OSC will be synchronous (but issued
> simultaneously to each OSC) IIRC.  There were some patches mentioned quite a
> while back for asynchronous glimpses, but I didn't see much of a difference
> and never tried to narrow down why that was.  They also weren't production
> ready.
>
> Hopefully you have extended attributes disabled for your Samba servers as
> well because it could be much worse if you have to listxattrs and then fetch
> each one.  Every file in Lustre also has a trusted.lov or lustre.lov xattr
> that stores the striping information, so for every file you'd be doing a
> listxattr and getxattr.  Quite a while back I wrote a samba module to filter
> out Lustre striping xattrs but not others, which helped a little.
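
To illustrate the filtering idea only (this is not the Samba module mentioned above, which would have to be a VFS module inside Samba), here is a small Python sketch. The two xattr names come from the paragraph above; the function names are invented for the example, and os.listxattr is only available on Linux.

```python
import os

# Striping xattr names as given in the text above.
LUSTRE_STRIPING_XATTRS = {"trusted.lov", "lustre.lov"}


def filter_striping_xattrs(names):
    """Drop the Lustre striping xattrs from a list of names, keep the rest in order."""
    return [name for name in names if name not in LUSTRE_STRIPING_XATTRS]


def visible_xattrs(path):
    """List a file's xattr names with the striping entries hidden (Linux only)."""
    return filter_striping_xattrs(os.listxattr(path))
```

With the striping entry hidden, a caller that fetches every listed attribute issues one getxattr fewer per file, which is where the small gain comes from.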
>
> The last thing, which doesn't sound like it is affecting you, is how Lustre
> clients read directory pages.  Today in 1.8 and 2.0 each directory is read
> one page at a time (4k on x86_64).  If you have hundreds of thousands of
> files in your directories, they will be larger than a single page.  There are
> some patches already done by Whamcloud (
> http://jira.whamcloud.com/browse/LU-5) that let the directory readpage
> read ahead a full bulk RPC, which is 1 MB.  Depending on your
> environment this could be almost a 256x speedup.  This had a huge impact
> for me, but it only seems significant if you're not doing a stat on every file.
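
The 256x figure is just 1 MB divided by 4k. A rough Python sketch of the arithmetic follows; it counts readdir RPCs only, ignores everything else that contributes to latency, and the function names and the example directory size are made up for illustration.

```python
import math

PAGE_SIZE = 4 * 1024          # one directory page per RPC (4k on x86_64)
BULK_RPC_SIZE = 1024 * 1024   # one full 1 MB bulk RPC with readahead


def readdir_rpcs(dir_bytes, bytes_per_rpc):
    """Number of RPCs needed to read a directory of dir_bytes bytes."""
    return max(1, math.ceil(dir_bytes / bytes_per_rpc))


def rpc_reduction(dir_bytes):
    """Ratio of page-at-a-time RPCs to bulk RPCs: an upper bound on the gain."""
    return readdir_rpcs(dir_bytes, PAGE_SIZE) / readdir_rpcs(dir_bytes, BULK_RPC_SIZE)
```

For a hypothetical 16 MB directory this gives 4096 page-sized RPCs against 16 bulk RPCs, a ratio of 256. A directory that already fits in a couple of pages gains almost nothing.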
>
> Jeremy
>
> On Tue, Sep 6, 2011 at 8:21 PM, Indivar Nair <indivar.nair at techterra.in> wrote:
>
>> I have 1 OST Per OSS.
>> I have set the stripe to Lustre default. Should I still change the stripe
>> count to 1?
>> Yes, the reads are sequential.
>>
>> 1. Will increasing / decreasing the number of max_rpcs_in_flight help?
>> It is set to 32 now.
>>
>> 2. Just for experimenting, would it speed up listing if 1 OSS served 2
>> OSTs, thereby reducing the number of OSS and RPCs?
>>
>> Regards,
>>
>>
>> Indivar Nair
>>
>>
>> On Wed, Sep 7, 2011 at 1:28 AM, Adrian Ulrich <adrian at blinkenlights.ch> wrote:
>>
>>>
>>> > While normal file access works fine, the directory listing is extremely
>>> > slow.
>>> > Depending on the number of files in a directory, the listing takes
>>> > around 5 - 15 secs.
>>> >
>>> > I tried 'ls --color=none' and it worked fine; listed the contents
>>> > immediately.
>>>
>>> That's because 'ls --color=always|auto' does an lstat() of each file
>>> (--color=none doesn't) which causes Lustre to send:
>>>
>>>  - 1 RPC to the MDS per file
>>>  - 1 RPC (per file) to EACH OSS where the file is stored to get the
>>> filesize
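
As a back-of-the-envelope check of the two bullets above, here is a small Python sketch. It assumes each stripe of a file lives on a different OSS (true with 1 OST per OSS) and ignores statahead and caching; the function name and the file count in the note are hypothetical.

```python
def stat_rpcs(num_files, stripe_count):
    """Estimate RPCs for an lstat() of every file in a directory.

    Returns (mds_rpcs, oss_rpcs): one MDS RPC per file, plus one RPC per
    file to each OSS holding a stripe of it (to get the file size).
    """
    mds_rpcs = num_files
    oss_rpcs = num_files * stripe_count
    return mds_rpcs, oss_rpcs
```

For a directory of 10,000 files striped over 4 OSSes this gives 10,000 MDS RPCs plus 40,000 OSS RPCs; with a stripe count of 1 the OSS share drops to 10,000.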
>>>
>>> Some time ago I created a patch to speed up 'ls' while keeping (most)
>>> of the colors
>>> (
>>> https://github.com/adrian-bl/patchwork/blob/master/coreutils/ls/ls-lustre.diff
>>> )
>>>
>>> But patching samba will not be possible in your case as it really needs
>>> the information returned by stat().
>>>
>>>
>>> > Double clicking on directory takes a long long time to display.
>>>
>>> Attach `strace' to samba: It will probably be busy doing lstat() which is
>>> a 'slow' operation on Lustre in any case.
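
If attaching strace to the Samba process is not convenient, a rough way to see the same effect from a client is to time the plain directory read separately from the per-file lstat() calls. The Python sketch below does that; nothing in it is Lustre-specific, and the function name is made up for the example.

```python
import os
import sys
import time


def time_listing(path):
    """Return (entry count, seconds to list names, seconds to lstat every entry)."""
    t0 = time.perf_counter()
    names = os.listdir(path)           # readdir only, like 'ls --color=none'
    t1 = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))   # what a colored 'ls' or Samba adds
    t2 = time.perf_counter()
    return len(names), t1 - t0, t2 - t1


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, list_secs, stat_secs = time_listing(target)
    print(f"{count} entries: listdir {list_secs:.3f}s, lstat {stat_secs:.3f}s")
```

Run it against a large directory on the Lustre mount right after mounting, since a second run may be served from the client cache.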
>>>
>>>
>>>
>>> > The cluster consists of:
>>> > - two DRBD Mirrored MDS Servers (Dell R610s) with 10K RPM disks
>>> > - four OSS Nodes (2 Node Cluster (Dell R710s) with a common storage
>>> > (Dell MD3200))
>>>
>>> How many OSTs do you have per OSS?
>>> What's your stripe setting? Setting the stripe count to 1 could give you a
>>> huge speedup (without affecting normal I/O, as I assume that the 9MB files
>>> are read/written sequentially).
>>>
>>>
>>> Regards,
>>>  Adrian
>>>
>>>
>>> --
>>>  RFC 1925:
>>>   (11) Every old idea will be proposed again with a different name and
>>>        a different presentation, regardless of whether it works.
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>
>>
>>
>