<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 12 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}

o\:* {behavior:url(#default#VML);}

w\:* {behavior:url(#default#VML);}

.shape {behavior:url(#default#VML);}

</style><![endif]--><style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

@font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";

        color:black;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

p.MsoAcetate, li.MsoAcetate, div.MsoAcetate

        {mso-style-priority:99;

        mso-style-link:"Balloon Text Char";

        margin:0cm;

        margin-bottom:.0001pt;

        font-size:8.0pt;

        font-family:"Tahoma","sans-serif";

        color:black;}

span.EmailStyle17

        {mso-style-type:personal-reply;

        font-family:"Times New Roman","serif";

        color:#1F497D;}

span.BalloonTextChar

        {mso-style-name:"Balloon Text Char";

        mso-style-priority:99;

        mso-style-link:"Balloon Text";

        font-family:"Tahoma","sans-serif";

        color:black;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:612.0pt 792.0pt;

        margin:72.0pt 72.0pt 72.0pt 72.0pt;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="2050" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]--></head><body bgcolor=white lang=EN-GB link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='color:#1F497D'>Nasf,<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>Interesting results.  Thank you - especially for graphing the results so thoroughly.<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>I’m attaching them here and cc-ing lustre-devel since these are of general interest.<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>I don’t think your conclusion number (1), to say CLIO locking is slowing us down<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>is as obvious from these results as you imply.  If you just compare the 1.8 and<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>patched 2.x per-file times and how they scale with #stripes you get this…<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><img width=668 height=461 id="Chart_x0020_3" src="cid:image001.png@01CC1BAD.57331740"></span><span style='color:#1F497D'><o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>The gradients of these lines should correspond to the additional time per stripe required<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>to stat each file and I’ve graphed these times below (ignoring the 0-stripe data for this<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>calculation because I’m just interested in the incremental per-stripe overhead).<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><img 
width=668 height=371 id="Chart_x0020_5" src="cid:image004.png@01CC1BAD.57331740"></span><span style='color:#1F497D'><o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>They show per-stripe overhead for 1.8 well above patched 2.x for the lower stripe<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>counts, but whereas 1.8 gets better with more stripes, patched 2.x gets worse.  I’m<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>guessing that at high stripe counts, 1.8 puts many concurrent glimpses on the wire<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>and does it quite efficiently.  I’d like to understand better how you control the #<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>of glimpse-aheads you keep on the wire – is it a single fixed number, or a fixed<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>number per OST or some other scheme?  In any case, it will be interesting to see<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>measurements at higher stripe counts.<o:p></o:p></span></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='color:#1F497D'>Cheers, <br>                   Eric <o:p></o:p></span></p></blockquote><div style='border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt'><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>From:</span></b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'> Fan Yong [mailto:yong.fan@whamcloud.com] <br><b>Sent:</b> 12 May 2011 10:18 AM<br><b>To:</b> Eric Barton<br><b>Cc:</b> Bryon Neitzel; Ian Colle; Liang Zhen<br><b>Subject:</b> New test results for "ls 
-Ul"<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-bottom:12.0pt'>I have improved the statahead load-balancing mechanism to distribute statahead load across more CPUs on the client, and adjusted AGL to follow the CLIO lock state machine. With these improvements, 'ls -Ul' runs faster than with the old patches, especially on large SMP nodes.<br><br>On the other hand, as the degree of parallelism increases, the lower-level network scheduler becomes the performance bottleneck, so I combined my patches with Liang's SMP patches for this test.<o:p></o:p></p><table class=MsoNormalTable border=1 cellpadding=0 width="100%" style='width:100.0%'><tr><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>client (fat-intel-4, 24 cores)<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>server (client-xxx, 4 OSSes, 8 OSTs on each OSS)<o:p></o:p></p></td></tr><tr><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>b2x_patched<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>my patches + SMP patches<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>my patches<o:p></o:p></p></td></tr><tr><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>b18<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>original b1_8<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>share the same server with "b2x_patched"<o:p></o:p></p></td></tr><tr><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>b2x_original<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p class=MsoNormal>original b2_x<o:p></o:p></p></td><td valign=top style='padding:1.5pt 1.5pt 1.5pt 1.5pt'><p 
class=MsoNormal>original b2_x<o:p></o:p></p></td></tr></table><p class=MsoNormal><br>Some notes:<br><br>1) Stripe count has a large effect on traversal performance, and the impact is worse than linear. Even with all the patches applied to b2_x, stripe count still has a larger impact than on b1_8. This is related to the complex CLIO lock state machine and its tedious iteration/repeat operations; it is not easy to make it run as efficiently as b1_8.<br><br>2) Patched b2_x is much faster than original b2_x: for traversing a 400K * 32-striped directory, the improvement is 100 times or more.<br><br>3) Patched b2_x is also faster than b1_8: in our tests, patched b2_x is at least 4X faster than b1_8, which meets the requirement in the ORNL contract.<br><br>4) Original b2_x is faster than b1_8 only for small stripe counts (no more than 4 stripes). For large stripe counts it is slower than b1_8, which is consistent with the ORNL test results.<br><br>5) The largest stripe count in our test is 32. We do not have enough resources to test larger stripe counts, and I also wonder whether testing larger striped directories is worthwhile. How many customers want to use large, fully striped directories, that is, 1M * 160-striped items in a single directory? If that is a rare case, spending a lot of time on it is not worthwhile.<br><br>We need to confirm with ORNL what the final acceptance test cases and environment are, including:<br>a) stripe count<br>b) item count<br>c) network latency, with or without an LNet router (I suggest without)<br>d) OST count on each OSS<br><br><br>Cheers,<br>--<br>Nasf<o:p></o:p></p></div></div></body></html>