<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    All,<br>

    <br>

    I have an application that writes a 100 GB file forwards, and then
    begins a sequence of reading a 70 GB section of the file forwards
    and backwards.  At some point in the run, not always at the same
    point, the read performance degrades significantly.  The initial
    forward reads run at about 1.3 GB/s and the backwards reads at about
    300 MB/s.  In an instant, the forward read performance drops to
    2.8 MB/s.  From about 250 seconds on, this is the only file that is
    being read or written by the application, which runs on a dedicated
    client node.<br>

    The file has a stripe count of 4 and a stripe size of 512 KB.  If
    the stripe count is changed to 1, this behavior does not occur.
    The CPU usage is minimal during the period of degraded
    performance.<br>

    The LNET traffic is also about 2.8 MB/s during the period of
    degraded performance.  The system has 64 GB of memory, so Lustre
    cannot cache the entire 70 GB active section of the file that is
    being read.<br>

    The Lustre client version is 2.9.0.<br>
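    <br>
    For reference, here is a minimal sketch of how file offsets map onto
    the four stripe objects, assuming the file uses the plain round-robin
    (RAID-0) layout.  The function names are made up for illustration and
    are not part of any Lustre tool.  With stripe-aligned 512 KB
    requests, each request should land on exactly one stripe object, and
    consecutive requests should rotate through all four.<br>
    <pre>
```python
STRIPE_SIZE = 512 * 1024   # stripe size of the file described above
STRIPE_COUNT = 4           # stripe count of the file described above


def stripe_index(offset, stripe_size=STRIPE_SIZE, stripe_count=STRIPE_COUNT):
    """Return which stripe object (0 .. stripe_count-1) holds this file offset.

    Assumes a plain round-robin (RAID-0) layout; hypothetical helper.
    """
    return (offset // stripe_size) % stripe_count


def object_offset(offset, stripe_size=STRIPE_SIZE, stripe_count=STRIPE_COUNT):
    """Return the byte offset inside that stripe object (same assumption)."""
    stripe_number = offset // stripe_size
    return (stripe_number // stripe_count) * stripe_size + offset % stripe_size


if __name__ == "__main__":
    # Walk the first eight 512 KB requests of a forward read.
    for request in range(8):
        off = request * STRIPE_SIZE
        print(off, stripe_index(off), object_offset(off))
```
</pre>
    A backwards read visits the same stripe objects in the reverse
    rotation, so under this assumption every stripe object sees
    decreasing offsets.<br>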

    <br>

    Any ideas what could be causing this?  What should I be watching
    under /proc/fs/lustre to find some clues?<br>

    <br>

    The behavior is depicted in the image below, which shows the file
    position as a function of wall clock time.  All writes and reads
    are 512 KB in size.<br>
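    <br>
    As a rough sketch of the access pattern (not the actual application
    code), something like the following could serve as a scaled-down
    reproducer.  The path, the sizes in the example, the position of the
    section within the file, and the use of pread are all assumptions
    made for illustration.<br>
    <pre>
```python
import os

REQUEST_SIZE = 512 * 1024  # request size stated above


def write_forwards(path, file_size, request_size=REQUEST_SIZE):
    """Write file_size bytes sequentially, starting at offset 0."""
    buf = b"\0" * request_size
    with open(path, "wb") as f:
        for _ in range(file_size // request_size):
            f.write(buf)


def read_section(path, start, length, backwards=False,
                 request_size=REQUEST_SIZE):
    """Read the section beginning at start, one request at a time.

    Returns the list of offsets in the order they were read.
    """
    offsets = list(range(start, start + length, request_size))
    if backwards:
        offsets.reverse()
    fd = os.open(path, os.O_RDONLY)
    try:
        for off in offsets:
            os.pread(fd, request_size, off)  # assumption: positional reads
    finally:
        os.close(fd)
    return offsets


if __name__ == "__main__":
    # Hypothetical file name and scaled-down sizes: 64 MB file, 32 MB section.
    path = "pattern_test.dat"
    write_forwards(path, 64 * 1024 * 1024)
    for _ in range(3):
        read_section(path, 16 * 1024 * 1024, 32 * 1024 * 1024)
        read_section(path, 16 * 1024 * 1024, 32 * 1024 * 1024, backwards=True)
```
</pre>
    To resemble the run described here, the section being read would
    have to be larger than the client's memory, so that it cannot be
    served entirely from cache.<br>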

    <br>

    Thanks,<br>

    <br>

    John<br>

    <br>

    <br>

    <br>

    <img src="cid:part1.7E2D78BA.3432AFCC@iodoctors.com" alt="">

    <pre class="moz-signature" cols="72">-- 

I/O Doctors, LLC

507-766-0378

<a class="moz-txt-link-abbreviated" href="mailto:bauerj@iodoctors.com">bauerj@iodoctors.com</a></pre>

  </body>

</html>