<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>All,</p>

    <p>I am back to trying to emulate Hybrid I/O  from user space, doing

      direct and buffered I/O to the same file concurrently.  I open a

      file twice, once with O_DIRECT, and once without.  Note that you

      will see 2 different file names involved, buffered.dat and

      direct.dat.  direct.dat is a symlink to buffered.dat and this is

      done so my tool can more easily display the direct and non-direct

      I/O differently.  The file has striping of

      512M@4{100,101,102,103}x32M&lt;ssd-pool +

      EOF@4{104,105,106,107}x32M&lt;ssd-pool.  The application first

      writes 512M (32M per write) to only the first PFL component
      
      using the non-direct fd.  Then the application writes 512M (32M per
      
      write), alternating between the direct fd and the non-direct fd.  The
      
      very first write (using direct) into the 2nd component triggers

      the dump of the entire first component from buffer cache.  From

      that point on the 2 OSC that handle the non-direct writes

      accumulate cache.  The 2 OSC that handle the direct writes

      accumulate no cache.  My question: Why does Lustre dump the 1st

      component from buffer cache?  The 1st and 2nd components do not

      even share OSCs.  Lustre has no problem dealing with direct and

      non-direct I/O in the same component (2nd component in this

      case).  To me it would seem that if Lustre can correctly buffer

      direct and non-direct in the same component, it should be able to

      correctly buffer direct and non-direct in multiple components.  My

      ultimate goal is to have the first, and smaller component, remain

      cached, and the remainder of the file use direct I/O, but as soon

      as I do a direct I/O, I lose all my buffer cache.</p>
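    <p>For reference, a minimal Python sketch of the write pattern described
      above.  It assumes the file already exists with the PFL layout applied
      and that direct.dat is a symlink to buffered.dat.  The names write_plan
      and run are illustrative only and are not taken from the actual tool.</p>

```python
import mmap
import os

MIB = 1024 * 1024
WRITE_SIZE = 32 * MIB        # one stripe-sized write, as in the post
COMPONENT_SIZE = 512 * MIB   # extent of the first PFL component


def write_plan():
    """Return (offset, use_direct) for each 32M write, in issue order."""
    per_phase = COMPONENT_SIZE // WRITE_SIZE  # 16 writes per 512M phase
    plan = []
    # Phase 1: fill the first component through the buffered fd only.
    for i in range(per_phase):
        plan.append((i * WRITE_SIZE, False))
    # Phase 2: second component, alternating fds, direct fd first.
    for i in range(per_phase):
        plan.append((COMPONENT_SIZE + i * WRITE_SIZE, i % 2 == 0))
    return plan


def run(buffered_path="buffered.dat", direct_path="direct.dat"):
    """Issue the plan; direct_path is assumed to be a symlink to buffered_path."""
    # Anonymous mapping: page-aligned memory, which O_DIRECT requires.
    buf = mmap.mmap(-1, WRITE_SIZE)
    # Assumption: the file was created beforehand with the desired layout.
    buffered_fd = os.open(buffered_path, os.O_WRONLY)
    direct_fd = os.open(direct_path, os.O_WRONLY | os.O_DIRECT)  # Linux only
    try:
        for offset, use_direct in write_plan():
            fd = direct_fd if use_direct else buffered_fd
            written = os.pwrite(fd, buf, offset)
            if written != WRITE_SIZE:
                raise OSError(f"short write at offset {offset}: {written} bytes")
    finally:
        os.close(direct_fd)
        os.close(buffered_fd)
        buf.close()
```

    <p>Calling run() from the directory holding both files issues the 32
      writes in order.  write_plan() on its own lists each offset and
      which fd the write uses, without touching the file system.</p>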

    <p>The top frame of the plot is the amount of cache used by each OSC

      versus time. The bottom frame of the plot is the File Position

      Activity versus time.  Next to each pwrite64() depicted, I

      indicate which OSC is being written to.  I have also colored the

      pwrite64()s by whether they used the direct fd (green) or

      non-direct fd (red).  As soon as the 2nd PFL component is touched

      by a direct write, that write waits until the OSCs of the first

      PFL component dump all their cache.</p>

    <p>John</p>

    <p><font size="6">Image 1 :</font></p>

    <p><img moz-do-not-send="false"

        src="cid:part1.Hcv00XOh.0605yO7N@iodoctors.com"

alt="https://www.dropbox.com/scl/fi/d7seezfj0gtxo1y7lzpvy/split_direct.png?rlkey=0sfo1erxo5ua1aef5ijfc81jx&st=pxb0qnts&dl=0"

        width="1374" height="873"></p>

  </body>

</html>