<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

    <title></title>

  </head>

  <body text="#000000" bgcolor="#ffffff">

    Inline comments follow:<br>

    <br>

    On 5/30/11 1:51 PM, Jinshan Xiong wrote:

    <blockquote

      cite="mid:BA5D598A-2A89-48DF-A67A-4ACDD8B1F409@whamcloud.com"

      type="cite"><base href="x-msg://164/"><br>

      <div>

        <div>On May 26, 2011, at 6:01 AM, Eric Barton wrote:</div>

        <br class="Apple-interchange-newline">

        <blockquote type="cite"><span class="Apple-style-span"

            style="border-collapse: separate; font-family: 'Trebuchet

            MS'; font-style: normal; font-variant: normal; font-weight:

            normal; letter-spacing: normal; line-height: normal;

            orphans: 2; text-indent: 0px; text-transform: none;

            white-space: normal; widows: 2; word-spacing: 0px;

            font-size: medium;">

            <div bgcolor="white" link="blue" vlink="purple" lang="EN-GB">

              <div class="WordSection1" style="page: WordSection1;">

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">Nasf,<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><o:p> </o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">Interesting

                    results.  Thank you - especially for graphing the

                    results so thoroughly.<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">I’m attaching them

                    here and cc-ing lustre-devel since these are of

                    general interest.<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><o:p> </o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">I don’t think your

                    conclusion number (1), to say CLIO locking is

                    slowing us down<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">is as obvious from

                    these results as you imply.  If you just compare the

                    1.8 and<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">patched 2.x

                    per-file times and how they scale with #stripes you

                    get this…<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><o:p> </o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><span>[image001.png: per-file stat times for 1.8 and patched 2.x versus stripe count]</span></span><span

                    style="color: rgb(31, 73, 125);"><o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><o:p> </o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">The gradients of

                    these lines should correspond to the additional time

                    per stripe required<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">to stat each file

                    and I’ve graphed these times below (ignoring the

                    0-stripe data for this<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">calculation because

                    I’m just interested in the incremental per-stripe

                    overhead).<o:p></o:p></span></div>
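                <div>A minimal sketch of the gradient calculation described above, assuming the gradient is taken between consecutive measured stripe counts. The function name and the sample numbers are hypothetical and are not taken from the attached results.</div>
<pre>
```python
def per_stripe_overhead(times_by_stripes):
    """Return {stripe_count: incremental time per extra stripe}.

    times_by_stripes maps a stripe count to the per-file stat time.
    The 0-stripe point is skipped, as in the message above.
    """
    points = sorted((n, t) for n, t in times_by_stripes.items() if n != 0)
    slopes = {}
    # Gradient of each line segment between neighbouring stripe counts.
    for (n0, t0), (n1, t1) in zip(points, points[1:]):
        slopes[n1] = (t1 - t0) / (n1 - n0)
    return slopes


# Made-up per-file times in microseconds, for illustration only.
sample = {0: 50.0, 1: 100.0, 2: 140.0, 4: 200.0, 8: 360.0}
print(per_stripe_overhead(sample))
```
</pre>
                <div>With these invented inputs the slopes are 40.0, 30.0 and 40.0 microseconds per stripe; a rising slope at higher stripe counts is the pattern reported here for patched 2.x.</div>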

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><o:p> </o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);"><span>[image004.png: per-stripe overhead for 1.8 and patched 2.x versus stripe count]</span></span><span

                    style="color: rgb(31, 73, 125);"><o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">They show

                    per-stripe overhead for 1.8 well above patched 2.x

                    for the lower stripe<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">counts, but whereas

                    1.8 gets better with more stripes, patched 2.x gets

                    worse.  I’m<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">guessing that at

                    high stripe counts, 1.8 puts many concurrent

                    glimpses on the wire<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">and does it quite

                    efficiently.  I’d like to understand better how you

                    control the #<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">of glimpse-aheads

                    you keep on the wire – is it a single fixed number,

                    or a fixed<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">number per OST or

                    some other scheme?  In any case, it will be

                    interesting to see<o:p></o:p></span></div>

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><span

                    style="color: rgb(31, 73, 125);">measurements at

                    higher stripe counts.<o:p></o:p></span></div>

                <blockquote style="margin-top: 5pt; margin-bottom: 5pt;">

                  <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                    font-family: 'Times New Roman',serif; color: black;"><span

                      style="color: rgb(31, 73, 125);" lang="EN-US">Cheers,<span

                        class="Apple-converted-space"> </span><br>

                                         Eric<o:p></o:p></span></div>

                </blockquote>

                <div style="border-style: none none none solid;

                  border-left: 1.5pt solid blue; padding: 0cm 0cm 0cm

                  4pt; position: static; z-index: auto;">

                  <div>

                    <div style="border-style: solid none none;

                      border-top: 1pt solid rgb(181, 196, 223); padding:

                      3pt 0cm 0cm;">

                      <div style="margin: 0cm 0cm 0.0001pt; font-size:

                        12pt; font-family: 'Times New Roman',serif;

                        color: black;"><b><span style="font-size: 10pt;

                            font-family: Tahoma,sans-serif; color:

                            windowtext;" lang="EN-US">From:</span></b><span

                          style="font-size: 10pt; font-family:

                          Tahoma,sans-serif; color: windowtext;"

                          lang="EN-US"><span

                            class="Apple-converted-space"> </span>Fan

                          Yong [<a class="moz-txt-link-freetext" href="mailto:yong.fan@whamcloud.com">mailto:yong.fan@whamcloud.com</a>]<span

                            class="Apple-converted-space"> </span><br>

                          <b>Sent:</b><span

                            class="Apple-converted-space"> </span>12 May

                          2011 10:18 AM<br>

                          <b>To:</b><span class="Apple-converted-space"> </span>Eric

                          Barton<br>

                          <b>Cc:</b><span class="Apple-converted-space"> </span>Bryon

                          Neitzel; Ian Colle; Liang Zhen<br>

                          <b>Subject:</b><span

                            class="Apple-converted-space"> </span>New

                          test results for "ls -Ul"<o:p></o:p></span></div>

                    </div>

                  </div>

                  <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                    font-family: 'Times New Roman',serif; color: black;"><o:p> </o:p></div>

                  <p class="MsoNormal" style="margin: 0cm 0cm 12pt;

                    font-size: 12pt; font-family: 'Times New

                    Roman',serif; color: black;">I have improved the
                    statahead load-balancing mechanism to distribute the
                    statahead load across more CPUs on the client, and
                    adjusted AGL to fit the CLIO lock state machine.
                    With these improvements, 'ls -Ul' runs faster than
                    with the old patches, especially on large SMP nodes.<br>
                    <br>
                    On the other hand, as the degree of parallelism
                    increases, the lower-level network scheduler becomes
                    the performance bottleneck, so I combined my patches
                    with Liang's SMP patches in the test.<o:p></o:p></p>

                  <table class="MsoNormalTable" style="width: 1019px;"

                    width="100%" border="1" cellpadding="0">

                    <tbody>

                      <tr>

                        <td style="padding: 1.5pt;" valign="top"><br>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">client

                            (fat-intel-4, 24 cores)<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">server

                            (client-xxx, 4 OSSes, 8 OSTs on each OSS)<o:p></o:p></div>

                        </td>

                      </tr>

                      <tr>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">b2x_patched<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">my patches +

                            SMP patches<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">my patches<o:p></o:p></div>

                        </td>

                      </tr>

                      <tr>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">b18<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">original b1_8<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">shares the same
                            server as "b2x_patched"<o:p></o:p></div>

                        </td>

                      </tr>

                      <tr>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">b2x_original<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">original b2_x<o:p></o:p></div>

                        </td>

                        <td style="padding: 1.5pt;" valign="top">

                          <div style="margin: 0cm 0cm 0.0001pt;

                            font-size: 12pt; font-family: 'Times New

                            Roman',serif; color: black;">original b2_x<o:p></o:p></div>

                        </td>

                      </tr>

                    </tbody>

                  </table>

                  <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                    font-family: 'Times New Roman',serif; color: black;"><br>

                    Some notes:<br>

                    <br>

                    1) Stripe count strongly affects traversal
                    performance, and the impact is more than linear. Even
                    with all the patches applied to b2_x, stripe count
                    has a larger impact than on b1_8. This is related to
                    the complex CLIO lock state machine and its tedious
                    iteration/repeat operations. It is not easy to make
                    it run as efficiently as b1_8.<br>

                  </div>

                </div>

              </div>

            </div>

          </span></blockquote>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>Hi there,</div>

        <div><br>

        </div>

        <div>I did some tests to investigate the overhead of the CLIO
          lock state machine and glimpse locks, and I found something new.</div>

        <div><br>

        </div>

        <div>Basically I did the same thing as Nasf, but I only cared
          about the overhead of glimpse locks. For this purpose, I ran
          'ls -lU' twice for each test. The first run is only used to
          populate the cache of IBITS UPDATE locks for the files. I then
          dropped the cl_locks and ldlm_locks from the client-side cache
          by setting lru_size of the ldlm namespaces to zero, and ran
          'ls -lU' again. In the second run, the statahead thread always
          finds a cached IBITS lock (we can check the mdc lock_count to
          be sure), so the elapsed time of ls is glimpse-related.</div>

        <div><br>

        </div>

        <div>This is what I got from the test:</div>

        <div><br>

        </div>

      </div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>


      <div class="AppleOriginalContents">

        <div><br>

        </div>

        <div><br>

        </div>

        <div>Description and test environment:</div>

        <div>- `ls -Ul time' means the time to finish the second run; </div>

        <div>- 100K means 100K files under the same directory; 400K

          means 400K files under the same directory;</div>

        <div>- there are two OSSes in my test, each with 8 OSTs; the
          OSTs are interleaved across the two OSSes, i.e., OST0, 2, 4, ...
          are on OSS0 and OST1, 3, 5, ... are on OSS1;</div>

        <div>- each node has 12G memory, 4 CPU cores;</div>

        <div>- latest lustre-master build, b140</div>

        <div><br>

        </div>

        <div>and the prorated per-stripe overhead:</div>

        <div><br>

        </div>

      </div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>


      <div class="AppleOriginalContents">

        <div><br>

        </div>

        <div><br>

        </div>

        <div>From the above test, it is very hard to conclude that
          cl_lock causes the ls time to increase with the stripe
          count.</div>

        <div><br>

        </div>

        <div>Here is the test script I used to do the test, and test

          output is attached as well. Please let me know if I missed

          something.</div>

      </div>

    </blockquote>

    <br>

    <br>

    In theory, the glimpse RPCs for the stripes of a single file should
    be processed in parallel, so a higher stripe count should mean a
    lower average per-stripe overhead; at least, that is the
    expectation. A flat line does not show that the overhead is small
    enough. I suggest comparing with b1_8 on the same tests.<br>
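    <br>
    To illustrate the expectation, here is a rough sketch of two idealized models, assuming one glimpse RPC per stripe and a fixed round-trip time. The function names and the 100-microsecond figure are hypothetical, not measurements.<br>
<pre>
```python
def per_stripe_time(per_file_time, stripes):
    """Prorate a per-file time over the stripes of the file."""
    return per_file_time / stripes


def parallel_model(rpc_time, stripes):
    """All glimpse RPCs of a file overlap: per-file time stays constant."""
    return rpc_time


def serial_model(rpc_time, stripes):
    """Glimpse RPCs run one after another: per-file time grows linearly."""
    return rpc_time * stripes


RPC_TIME = 100.0  # hypothetical glimpse round trip, in microseconds
for stripes in (1, 2, 4, 8):
    parallel = per_stripe_time(parallel_model(RPC_TIME, stripes), stripes)
    serial = per_stripe_time(serial_model(RPC_TIME, stripes), stripes)
    print(stripes, parallel, serial)
```
</pre>
    In this toy model the prorated time falls as 1/stripes when the RPCs overlap fully, and stays flat at 100.0 when they are fully serialized, so a flat prorated line is what serialized processing would produce.<br>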

    <br>

    <br>

    <blockquote

      cite="mid:BA5D598A-2A89-48DF-A67A-4ACDD8B1F409@whamcloud.com"

      type="cite">

      <div class="AppleOriginalContents">

        <div><br>

        </div>

        <div><br>

        </div>

      </div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>


      <div>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>===================</div>

        <div>Let's take a step back and reconsider the real cause in
          Nasf's test. I tend to think the load on the OSSes might cause
          that symptom. Async Glimpse Lock clearly puts more stress on
          the OSSes, especially in his test environment, where multiple
          OSTs are on the same OSS. This also makes the ls time increase
          with the stripe count, since each OSS has to handle more RPCs
          in a given time as the stripe count increases. This problem
          may be mitigated by distributing the OSTs across more OSSes.</div>
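        <div>A small sketch of the arithmetic behind this argument, assuming one glimpse RPC per stripe, OST i placed on OSS (i modulo the number of OSSes), and a file striped over consecutive OSTs. The function name and these layout rules are simplifying assumptions for illustration.</div>
<pre>
```python
from collections import Counter


def rpcs_per_oss(stripe_count, oss_count, first_ost=0):
    """Count the glimpse RPCs each OSS receives for one file.

    Assumes OST i lives on OSS (i % oss_count) and that the file uses
    consecutive OSTs starting at first_ost.
    """
    return Counter((first_ost + i) % oss_count for i in range(stripe_count))


# A 16-striped file on 2 OSSes versus 4 OSSes.
print(max(rpcs_per_oss(16, 2).values()))
print(max(rpcs_per_oss(16, 4).values()))
```
</pre>
        <div>Under these assumptions each OSS handles 8 RPCs per file with 2 OSSes and 4 with 4 OSSes, so the per-OSS load grows with the stripe count and shrinks as OSTs are spread over more OSSes.</div>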

      </div>

    </blockquote>

    <br>

    <br>

    Basically, I agree that heavy load on the OSSes may be the
    performance bottleneck. As I said in my earlier email, we found the
    CPU load on the OSSes was quite high when running "ls -Ul" for the
    large-striped cases. This would be easy to verify with sufficiently
    powerful OSSes; unfortunately, we do not have them now.<br>

    <br>

    Cheers,<br>

    --<br>

    Nasf<br>

    <br>

    <br>

    <blockquote

      cite="mid:BA5D598A-2A89-48DF-A67A-4ACDD8B1F409@whamcloud.com"

      type="cite">

      <div>

        <div><br>

        </div>

        <div>Thanks,</div>

        <div>Jinshan</div>

        <br>

        <blockquote type="cite">

          <div bgcolor="white" link="blue" vlink="purple" lang="EN-GB">

            <div class="WordSection1" style="page: WordSection1;">

              <div style="border-style: none none none solid;

                border-left: 1.5pt solid blue; padding: 0cm 0cm 0cm 4pt;

                position: static; z-index: auto;">

                <div style="margin: 0cm 0cm 0.0001pt; font-size: 12pt;

                  font-family: 'Times New Roman',serif; color: black;"><br>

                  2) Patched b2_x is much faster than original b2_x: for
                  traversing a 400K * 32-striped directory, it is 100
                  times faster or more.<br>
                  <br>
                  3) Patched b2_x is also faster than b1_8: in our
                  tests, patched b2_x is at least 4X faster than b1_8,
                  which meets the requirement in the ORNL contract.<br>
                  <br>
                  4) Original b2_x is faster than b1_8 only for small
                  stripe counts (no more than 4 stripes). For large
                  stripe counts it is slower than b1_8, which is
                  consistent with the ORNL test results.<br>
                  <br>
                  5) The largest stripe count in our test is 32. We do
                  not have enough resources to test larger stripe
                  counts. I also wonder whether it is worth testing
                  larger striped directories at all: how many customers
                  want to use a large, fully striped directory, that is,
                  one containing 1M * 160-striped items in a single
                  directory? If that is a rare case, spending lots of
                  time on it is not worthwhile.<br>

                  <br>

                  We need to confirm with ORNL what the final acceptance
                  test cases and environment are, including:<br>
                  a) stripe count<br>
                  b) item count<br>
                  c) network latency, with or without an LNET router (I
                  suggest without)<br>

                  d) OST count on each OSS<br>

                  <br>

                  <br>

                  Cheers,<br>

                  --<br>

                  Nasf<o:p></o:p></div>

              </div>

            </div>

            <span>[attachment: result_20110512.xls]</span><br>

          </div>

        </blockquote>

      </div>

      <br>

    </blockquote>

    <br>

  </body>

</html>