<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<TITLE>RE: Re: [Lustre-discuss] lustre and small files overhead</TITLE>

</HEAD>

<BODY>

<!-- Converted from text/plain format -->


<P><FONT SIZE=2>I vote for the warning. A first-time user debugs the setup until it all works reliably, and should then know to turn debugging off.<BR>

<BR>

For that you could have a startup warning plus emphasis in the documentation, i.e. mention it in the quickstart section as well.<BR>

<BR>

Michael<BR>

<BR>

 -----Original Message-----<BR>

From:   Joe Barjo [<A HREF="mailto:jobarjo78@yahoo.fr">mailto:jobarjo78@yahoo.fr</A>]<BR>

Sent:   Monday, March 03, 2008 01:24 AM Pacific Standard Time<BR>

To:    <BR>

Cc:     lustre-discuss@lists.lustre.org<BR>

Subject:        Re: [Lustre-discuss] lustre and small files overhead<BR>

<BR>

Hi<BR>

Turning off debugging made it much better.<BR>

It went from 1m54 down to 25 seconds, but still with 85% of CPU time spent in system...<BR>

I really think you should turn off debugging by default, or make it appear as a BIG warning message.<BR>

People who are trying lustre for the first time are not going to debug lustre...<BR>

Also, looking in the documentation, though the debugging was documented, it was not clear that removing it would improve performance so much.<BR>

<BR>

I will now run more tests and see how the coherency holds up...<BR>

Thanks for your support.<BR>

<BR>

Andreas Dilger wrote:<BR>

<BR>

        On Feb 29, 2008  15:37 +0100, Joe Barjo wrote:<BR>

         <BR>

<BR>

                We have a (small) 30-node SGE-based cluster running CentOS 4, which will<BR>

                grow to a maximum of 50 Core Duos.<BR>

                We use custom software based on gmake to launch parallel<BR>

                compilations and computations with lots of small files and some large files.<BR>

                We currently use NFS and have a lot of problems with inconsistencies between<BR>

                nodes.<BR>

               <BR>

                I'm currently evaluating lustre and have some questions about lustre<BR>

                overhead with small files.<BR>

                I successfully installed the rpms on a test machine and launched the<BR>

                local llmount.sh script.<BR>

                   <BR>

<BR>

       <BR>

        Note that if you are using the unmodified llmount.sh script this is running<BR>

        on loopback files in /tmp, so the performance is likely quite bad.  For<BR>

        a realistic performance measure, put the MDT and OST on separate disks.<BR>

       <BR>

         <BR>

<BR>

                The first thing I tried was an svn checkout into it (lots of small<BR>

                files...).<BR>

                It takes 1m54 from our local svn server versus 15s into a local ext3<BR>

                filesystem and 50s over nfs network.<BR>

                During the checkout, the processor (amd64 3200) is busy with 90% system.<BR>

               <BR>

                Why is there so much system time?<BR>

                   <BR>

<BR>

       <BR>

        Have you turned off debugging (sysctl -w lnet.debug=0)?<BR>

        Have you increased the DLM lock LRU sizes?<BR>

       <BR>

        for L in /proc/fs/lustre/ldlm/namespaces/*/lru_size; do<BR>

            echo 10000 > $L<BR>

        done<BR>

       <BR>

        In 1.6.5/1.8.0 it will be possible to use a new command to set<BR>

        this kind of parameter more easily:<BR>

       <BR>

        lctl set_param ldlm.namespaces.*.lru_size=10000<BR>
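
       <BR>

        A minimal sketch, assuming the /proc layout shown above: the loop could be wrapped in a small helper so it can be tried against a scratch directory before touching /proc. The name set_lru_size and its optional root-directory argument are hypothetical, not part of Lustre.<BR>

```shell
#!/bin/sh
# Hypothetical helper (the name and the ROOT argument are assumptions,
# not Lustre tools).
# Usage: set_lru_size VALUE [ROOT]   (ROOT defaults to /proc/fs/lustre)
set_lru_size() {
    value=$1
    root=${2:-/proc/fs/lustre}
    count=0
    for f in "$root"/ldlm/namespaces/*/lru_size; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$f" ] || continue
        if echo "$value" > "$f"; then
            count=$((count + 1))
        fi
    done
    # Report how many namespaces were updated.
    echo "$count"
}
```

        Calling set_lru_size 10000 as root would then do the same as the loop above and print the number of namespaces it changed.<BR>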

       <BR>

         <BR>

<BR>

                Is there something to tweak to lower this overhead?<BR>

                Is there a specific tweak for small files?<BR>

                   <BR>

<BR>

       <BR>

        Not really, this isn't Lustre's strongest point.<BR>

       <BR>

         <BR>

<BR>

                Using multiple server nodes, will the performance be better?<BR>

                   <BR>

<BR>

       <BR>

        Partly.  There can only be a single MDT per filesystem, but it can<BR>

        scale quite well with multiple clients.  There can be many OSTs,<BR>

        but it isn't clear whether you are IO bound.  It probably wouldn't<BR>

        hurt to have a few to give you a high IOPS rate.<BR>

       <BR>

        Note that increasing the OST count also by default allows clients to<BR>

        cache more dirty data (32MB/OST).  You can change this manually;<BR>

        by default it is tuned for very large clusters (1000s of nodes).<BR>

       <BR>

        for C in /proc/fs/lustre/osc/*/max_dirty_mb; do<BR>

                echo 256 > $C<BR>

        done<BR>

       <BR>

        Similarly, in 1.6.5/1.8.0 it will be possible to do:<BR>

       <BR>

        lctl set_param osc.*.max_dirty_mb=256<BR>
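
       <BR>

        Since the dirty-cache limit applies per OST, the total a client may cache grows with the OST count. A rough sketch of how to add it up, assuming each max_dirty_mb file holds a plain integer; the function name total_dirty_mb and its optional root-directory argument are hypothetical:<BR>

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): sum max_dirty_mb over all OSCs
# to estimate how much dirty data one client may cache in total.
# Usage: total_dirty_mb [ROOT]   (ROOT defaults to /proc/fs/lustre)
total_dirty_mb() {
    root=${1:-/proc/fs/lustre}
    total=0
    for f in "$root"/osc/*/max_dirty_mb; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$f" ] || continue
        # Assumes the file contains a single integer number of megabytes.
        mb=$(cat "$f")
        total=$((total + mb))
    done
    echo "$total"
}
```

        For example, with 4 OSTs at the default of 32MB each this would print 128.<BR>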

       <BR>

        Cheers, Andreas<BR>

        --<BR>

        Andreas Dilger<BR>

        Sr. Staff Engineer, Lustre Group<BR>

        Sun Microsystems of Canada, Inc.<BR>

       <BR>

       <BR>

         <BR>

<BR>

<BR>

</FONT>

</P>


</BODY>

</HTML>