[Lustre-discuss] lustre and small files overhead

Joe Barjo jobarjo78 at yahoo.fr
Mon Mar 3 01:24:07 PST 2008


Hi
Turning off debugging made it much better.
It went from 1m54s down to 25 seconds, but the CPU is still at about 85%
system time...
I really think you should turn off debugging by default, or make it
appear as a BIG warning message.
People who are trying Lustre for the first time are not going to be
debugging Lustre...
Also, while debugging is documented, it was not clear from the
documentation that turning it off would improve performance so much.
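For anyone else trying this: the setting Andreas points at below can be
turned off like this (a minimal sketch; the /proc path is just the procfs
equivalent of the sysctl, and this only changes the running node, not the
boot-time default):

# disable Lustre/LNET debug logging at runtime
sysctl -w lnet.debug=0
# equivalent via procfs
echo 0 > /proc/sys/lnet/debug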

I will now run more tests and see how the coherency between nodes holds up...
Thanks for your support.

Andreas Dilger wrote:
> On Feb 29, 2008  15:37 +0100, Joe Barjo wrote:
>   
>> We have a (small) 30-node SGE-based cluster running CentOS 4, which will
>> grow to a maximum of 50 Core Duo nodes.
>> We use custom software based on gmake to launch parallel compilation and
>> computations with lots of small files and some large files.
>> We currently use NFS and have a lot of problems with incoherencies
>> between nodes.
>>
>> I'm currently evaluating Lustre and have some questions about Lustre
>> overhead with small files.
>> I successfully installed the RPMs on a test machine and launched the
>> local llmount.sh script.
>>     
>
> Note that if you are using the unmodified llmount.sh script this is running
> on loopback files in /tmp, so the performance is likely quite bad.  For
> a realistic performance measure, put the MDT and OST on separate disks.
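>
> For reference, a minimal separate-device setup looks roughly like this
> (the device names, server node name and fsname below are only examples):
>
> # on the server: format the MDT (with MGS) and the OST on separate disks
> mkfs.lustre --fsname=testfs --mgs --mdt /dev/sdb
> mkfs.lustre --fsname=testfs --ost --mgsnode=mdsnode@tcp0 /dev/sdc
> mount -t lustre /dev/sdb /mnt/mdt
> mount -t lustre /dev/sdc /mnt/ost0
>
> # on a client
> mount -t lustre mdsnode@tcp0:/testfs /mnt/testfs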
>
>   
>> The first thing I tried was an svn checkout into it (lots of small
>> files...).
>> It takes 1m54s from our local svn server, versus 15s onto a local ext3
>> filesystem and 50s over NFS.
>> During the checkout, the processor (amd64 3200) is busy at 90% system time.
>>
>> Why is there so much system time?
>>     
>
> Have you turned off debugging (sysctl -w lnet.debug=0)?
> Have you increased the DLM lock LRU sizes?
>
> for L in /proc/fs/lustre/ldlm/namespaces/*/lru_size; do
>     echo 10000 > $L
> done
>
> In 1.6.5/1.8.0 it will be possible to use a new command to set
> this kind of parameter more easily:
>
> lctl set_param ldlm.namespaces.*.lru_size=10000
>
>   
>> Is there something to tweak to lower this overhead?
>> Is there a specific tweak for small files?
>>     
>
> Not really, this isn't Lustre's strongest point.
>
>   
>> Using multiple server nodes, will the performance be better?
>>     
>
> Partly.  There can only be a single MDT per filesystem, but it can
> scale quite well with multiple clients.  There can be many OSTs,
> but it isn't clear whether you are IO bound.  It probably wouldn't
> hurt to have a few to give you a high IOPS rate.
>
> Note that increasing the OST count also, by default, allows clients to
> cache more dirty data (32MB per OST).  You can change this manually;
> the default is tuned for very large clusters (1000's of nodes).
>
> for C in /proc/fs/lustre/osc/*/max_dirty_mb; do
> 	echo 256 > $C
> done
>
> Similarly, in 1.6.5/1.8.0 it will be possible to do:
>
> lctl set_param osc.*.max_dirty_mb=256
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>
>   
