<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body>
<p>Are you using any Lustre monitoring tools? We use ltop from the
LMT package (<a class="moz-txt-link-freetext" href="https://github.com/LLNL/lmt">https://github.com/LLNL/lmt</a>) and during that time of
high load you could see if there are bursts of IOPs coming in.
Running iotop or iostat might also provide some insight into the
load if based on I/O.</p>
<p>Cameron<br>
</p>
<div class="moz-cite-prefix">On 5/28/20 8:37 AM, Peeples, Heath
wrote:<br>
</div>
<blockquote type="cite" cite="mid:BN7PR01MB384264E2EE24147C9DD1A6F29C8E0@BN7PR01MB3842.prod.exchangelabs.com">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">I have 2 MDSs and periodically on one of
them (either at one time or another) peak above 300, causing
the file system to basically stop. This lasts for a few
minutes and then goes away. We can’t identify any one user
running jobs at the times we see this, so it’s hard to
pinpoint this on a user doing something to cause it. Could
anyone point me in the direction of how to begin debugging
this? Any help is greatly appreciated.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Heath<o:p></o:p></p>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
lustre-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
<a class="moz-txt-link-freetext" href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a>
</pre>
</blockquote>
</body>
</html>