<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Hi,</p>
    <p>You could also try this:<br>
    </p>
    <p> <a href="https://github.com/quentinbouyer/topmdt">https://github.com/quentinbouyer/topmdt</a></p>
    <div class="moz-cite-prefix">On 28/05/2020 at 18:32, Chad DeWitt
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAAyf6vCiTJA0SgT=sXh_Ew6JLGqF3EjZt+jimQX0+9pW3U1dww@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">Hi Heath,
                <div><br>
                </div>
                <div>Hope you're doing well!</div>
                <div><br>
                </div>
                <div>Your mileage may vary (and, quite frankly, there may
                  be better approaches), but here is a quick-and-dirty
                  set of steps to find which client is issuing a large
                  number of metadata operations:</div>
              </div>
            </div>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <div>
                <ul>
                  <li>Log into the affected MDS.</li>
                </ul>
              </div>
            </blockquote>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <div>
                <ul>
                  <li>Change into the exports directory.<br>
                  </li>
                </ul>
              </div>
            </blockquote>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <blockquote style="margin:0px 0px 0px
                40px;border:none;padding:0px">
                <div>
                  <div>
                    <div>
                      <div><font face="monospace">cd
                          /proc/fs/lustre/mdt/<i>&lt;Your affected
                            MDT&gt;</i>/exports/</font></div>
                    </div>
                  </div>
                </div>
              </blockquote>
            </blockquote>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <ul>
                <li>OPTIONAL: Set all your stats to zero and clear out
                  stale clients. (You can skip this step, but starting
                  from a clean slate makes the stats easier to read.
                  In fact, you may want to skip it the first time
                  through and just look for high numbers. If a
                  particular client is the source of the issue, its
                  stats should be clearly higher than those of the
                  other clients.)<br>
                </li>
              </ul>
            </blockquote>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <blockquote style="margin:0px 0px 0px
                40px;border:none;padding:0px">
                <div>
                  <div>
                    <div>
                      <div><font face="monospace">echo "C" > clear</font></div>
                    </div>
                  </div>
                </div>
              </blockquote>
            </blockquote>
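            <div>If you would rather not clear the counters, one
              alternative is to take two snapshots a few seconds apart
              and subtract them. A minimal sketch in Python, assuming
              each snapshot has already been turned into a mapping of
              operation name to sample count; the function name
              stats_delta and the numbers are made up for illustration.
              If the counters are cleared between the two snapshots,
              the differences can come out negative.</div>
<pre>
```python
def stats_delta(before, after):
    """Per-operation increase between two {operation: count} snapshots.

    Operations that are missing from 'before' are treated as starting at zero.
    """
    return {op: after[op] - before.get(op, 0) for op in after}


# Hypothetical snapshots taken a few seconds apart on the same client.
first = {"open": 278676, "close": 278629}
second = {"open": 279176, "close": 279100, "unlink": 495}

delta = stats_delta(first, second)
for op, increase in delta.items():
    print(op, increase)
```
</pre>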
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <div>
                <ul>
                  <li>Wait for a few seconds and dump the stats.<br>
                  </li>
                </ul>
              </div>
            </blockquote>
            <blockquote style="margin:0px 0px 0px
              40px;border:none;padding:0px">
              <blockquote style="margin:0px 0px 0px
                40px;border:none;padding:0px"><font face="monospace">for
                  client in */ ; do echo ; echo "${client}" ; cat
                  "${client}stats" ; echo ; done</font><br>
              </blockquote>
            </blockquote>
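            <div>If you want the same dump in a form you can process
              further, the loop above can be sketched in Python. This
              is only an illustration: read_client_stats is a made-up
              helper, the demo runs against a throwaway directory with
              invented client names, and on a real MDS you would pass
              the exports directory you changed into earlier.</div>
<pre>
```python
import tempfile
from pathlib import Path


def read_client_stats(exports_dir):
    """Return {client_name: stats_text} for every client directory
    under exports_dir that contains a 'stats' file."""
    result = {}
    for entry in sorted(Path(exports_dir).iterdir()):
        stats_file = entry / "stats"
        if entry.is_dir() and stats_file.is_file():
            result[entry.name] = stats_file.read_text()
    return result


# Demo layout: two made-up clients plus a plain file that must be ignored.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("10.0.0.1@tcp", "open 5 samples [reqs]\n"),
                       ("10.0.0.2@tcp", "")]:
        client_dir = Path(tmp) / name
        client_dir.mkdir()
        (client_dir / "stats").write_text(body)
    (Path(tmp) / "clear").write_text("")

    demo = read_client_stats(tmp)
    for client, text in demo.items():
        print(client)
        print(text)
```
</pre>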
            <div dir="ltr">
              <div dir="ltr">
                <div><br>
                </div>
                <div>You'll get a listing of stats for each mounted
                  client like so:</div>
                <div><br>
                </div>
              </div>
            </div>
          </div>
        </div>
        <blockquote style="margin:0 0 0 40px;border:none;padding:0px">
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">open                    
                       278676 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">close                    
                      278629 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">mknod                    
                      2320 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">unlink                  
                       495 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">mkdir                    
                      575 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">rename                  
                       1534 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">getattr                  
                      277552 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">setattr                  
                      550 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">getxattr                
                       2742 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">statfs                  
                       350058 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
          <div>
            <div>
              <div>
                <div>
                  <div><font face="monospace">samedir_rename          
                       1534 samples [reqs]</font></div>
                </div>
              </div>
            </div>
          </div>
        </blockquote>
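        <div>Lines in this shape are easy to turn into numbers. A
          minimal sketch, assuming every line of interest looks like
          "name count samples [unit]" as in the listing above; the
          helper name parse_stats is made up, and any line that does
          not fit that shape is simply skipped.</div>
<pre>
```python
def parse_stats(text):
    """Return {operation: sample_count} for lines shaped like
    'open   278676 samples [reqs]'; anything else is skipped."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "samples" and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


# A few lines copied from the listing above.
sample = """\
open                      278676 samples [reqs]
close                     278629 samples [reqs]
getattr                   277552 samples [reqs]
statfs                    350058 samples [reqs]
"""

counts = parse_stats(sample)
# Print operations from busiest to quietest.
busiest_first = sorted(counts, key=counts.get, reverse=True)
for op in busiest_first:
    print(op, counts[op])
```
</pre>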
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div><br>
                </div>
                <div>(Don't worry if some of the clients return what
                  appear to be empty stats. That just means they are
                  mounted but have not yet performed any metadata
                  operations.) From this data, you are looking for any
                  "high" sample counts.  The client with the high counts
                  is usually the culprit.  For the example client stats
                  above, I would look to see which process(es) on this
                  client are listing, opening, and then closing files in
                  Lustre... The advantage of this method is that you
                  see exactly which metadata operations are
                  occurring. (I know there are also various utilities
                  included with Lustre that may give this information as
                  well, but I just go to the source.)</div>
                <div><br>
                </div>
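                <div>With many clients mounted, picking out the "high"
                  one by eye gets tedious. A rough sketch of ranking
                  clients by their total sample count, assuming the
                  per-client stats are already available as nested
                  mappings; rank_clients and the client names are
                  hypothetical, and only client-a uses numbers from the
                  example above.</div>
<pre>
```python
def rank_clients(per_client, top=5):
    """Sort clients by their total sample count, busiest first."""
    totals = {client: sum(ops.values()) for client, ops in per_client.items()}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


example = {
    "client-a": {"open": 278676, "close": 278629, "getattr": 277552},
    "client-b": {"open": 12, "close": 12},
    "client-c": {},  # mounted, but no metadata operations yet
}

ranking = rank_clients(example)
for client, total in ranking:
    print(client, total)
```
</pre>
                <div>A total hides which operation dominates, so it is
                  still worth reading the per-operation numbers for
                  whichever client comes out on top.</div>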
                <div>Once you find the client, you can use various
                  commands, such as <font face="monospace">mount</font>
                  and <font face="monospace">lsof</font> to get a
                  better understanding of what may be hitting Lustre.</div>
                <div><br>
                </div>
                <div>Some of the more common issues I've found that can
                  cause a high MDS load:</div>
                <div>
                  <ul>
                    <li>Listing a directory containing a large number of
                      files. (Instead, unalias <font face="monospace">ls</font>
                      or, better yet, use <font face="monospace">lfs
                        find</font>.)</li>
                    <li>Removing many files.</li>
                    <li>Opening and closing many files. (It may be better to
                      move the data over to another file system, such as
                      XFS.  We keep some of our deep learning work off
                      Lustre because of the sheer number of small
                      files.)</li>
                  </ul>
                  Of course, the actual mitigation depends on
                  what the user is attempting to do...</div>
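                <div>On the directory-listing point: an aliased ls
                  (for example one that colors its output) typically
                  stats every entry, which is where much of the
                  metadata traffic comes from. If the listing is done
                  from a script, a sketch like the following counts
                  entries using only what the directory read returns.
                  It assumes the file system reports entry types in its
                  directory entries; where it does not, Python falls
                  back to a stat call per entry. count_entries is a
                  made-up helper and the demo uses a temporary
                  directory.</div>
<pre>
```python
import os
import tempfile


def count_entries(directory):
    """Count non-directories and subdirectories from directory-entry data."""
    files = dirs = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False keeps symlinks from triggering extra lookups.
            if entry.is_dir(follow_symlinks=False):
                dirs += 1
            else:
                files += 1
    return files, dirs


# Demo against a throwaway directory with three files and one subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        open(os.path.join(tmp, "file%d" % i), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))
    demo_counts = count_entries(tmp)
    print(demo_counts)
```
</pre>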
                <div><br>
                </div>
                <div>I hope this helps...</div>
                <div><br>
                </div>
                <div>Cheers,</div>
                <div>Chad</div>
                <div><br clear="all">
                  <div>
                    <div dir="ltr">
                      <div dir="ltr">
                        <div dir="ltr">
                          <div dir="ltr">
                            <div dir="ltr">
                              <div dir="ltr">
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"><font size="2"
                                      face="monospace, monospace">------------------------------------------------------------</font></span></p>
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"><font size="2"
                                      face="monospace, monospace">Chad
                                      DeWitt, CISSP</font></span></p>
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"><font size="2"
                                      face="monospace, monospace">UNC
                                      Charlotte <b>| </b>ITS –
                                      University Research Computing</font></span></p>
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"></span></p>
                                <p style="margin:0in 0in 0.0001pt"><font
                                    size="2" face="monospace, monospace"
                                    color="#000000"><span
                                      style="background-color:rgba(255,255,255,0)"><a
                                        href="mailto:ccdewitt@uncc.edu"
                                        style="color:rgb(17,85,204)"
                                        target="_blank"
                                        moz-do-not-send="true">ccdewitt@uncc.edu</a> <b>| </b><a
                                        style="color:rgb(34,34,34)"
                                        moz-do-not-send="true">www.uncc.edu</a></span></font></p>
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"><font size="2"
                                      face="monospace, monospace">------------------------------------------------------------</font></span></p>
                                <p style="margin:0in 0in 0.0001pt"><span
style="background-color:rgba(255,255,255,0)"><font size="2"
                                      face="monospace, monospace"><br>
                                    </font></span></p>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, May 28, 2020 at 11:37
          AM Peeples, Heath &lt;<a href="mailto:heathp@hpc.msstate.edu"
            target="_blank" moz-do-not-send="true">heathp@hpc.msstate.edu</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
          <div lang="EN-US">
            <div>
              <p class="MsoNormal">I have 2 MDSs, and periodically the
                load on one of them (either one, at different times)
                peaks above 300,
                causing the file system to basically stop.  This lasts
                for a few minutes and then goes away.  We can’t identify
                any one user running jobs at the times we see this, so
                it’s hard to pin this on a user doing something to
                cause it.   Could anyone point me in the direction of
                how to begin debugging this?  Any help is greatly
                appreciated.</p>
              <p class="MsoNormal"> </p>
              <p class="MsoNormal">Heath</p>
            </div>
          </div>
          _______________________________________________<br>
          lustre-discuss mailing list<br>
          <a href="mailto:lustre-discuss@lists.lustre.org"
            target="_blank" moz-do-not-send="true">lustre-discuss@lists.lustre.org</a><br>
          <a
            href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org"
            rel="noreferrer" target="_blank" moz-do-not-send="true">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
        </blockquote>
      </div>
      <br>
    </blockquote>
  </body>
</html>