<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 13/12/2016 at 14:33, Patrick Farrell

      wrote:<br>

    </div>

    <blockquote

cite="mid:CY4PR11MB1751D8EF7926721056AC3FFFCB9B0@CY4PR11MB1751.namprd11.prod.outlook.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html;

        charset=windows-1252">

      <br>

      <br>

      Quentin,<br>

      <br>

      I suspect that the pages are only maintained during the duration

      of an IO, then discarded.  I haven't dug into the exact mechanics

      of it, but when caches are disabled, the key thing is no CACHING

      occurs, i.e., nothing can be read from the cache.  So, I assume,

      these pages you see are transiently present for purposes of

      performing the IO.  (The data from the disk has to go somewhere.)<br>

      <br>

      - Patrick<br>

      <hr style="display:inline-block;width:98%" tabindex="-1">

      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"

          face="Calibri, sans-serif" color="#000000"><b>From:</b>

          lustre-devel <a class="moz-txt-link-rfc2396E" href="mailto:lustre-devel-bounces@lists.lustre.org">&lt;lustre-devel-bounces@lists.lustre.org&gt;</a> on

          behalf of <a class="moz-txt-link-abbreviated" href="mailto:quentin.bouget@cea.fr">quentin.bouget@cea.fr</a> <a class="moz-txt-link-rfc2396E" href="mailto:quentin.bouget@cea.fr">&lt;quentin.bouget@cea.fr&gt;</a><br>

          <b>Sent:</b> Tuesday, December 13, 2016 4:02:06 AM<br>

          <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:lustre-devel@lists.lustre.org">lustre-devel@lists.lustre.org</a><br>

          <b>Subject:</b> [lustre-devel] caching in Lustre</font>

        <div> </div>

      </div>

      <div>

        <p>Hi all,</p>

        <p>I am currently trying to work out how Lustre behaves when

          both "read_cache" and "writethrough_cache" are disabled. What

          I particularly want to know is how writing to the related

          proc files influences the cache policy.</p>

        <p>To me (and perf_event reports it too on a 2.7 setup), the

          code always gets cache pages (using find_or_create_page() in

          "lustre/osd-ldiskfs/osd_io.c") even with both cache parameters

          set to 0.<br>

          After that, if caching is disabled, a call to

          generic_error_remove_page() is issued on the pages that were

          allocated. This function is described in the kernel sources

          like this:<br>

        </p>

        <pre><b><i>/*

 * Used to get rid of pages on hardware memory corruption.

 */

int generic_error_remove_page(struct address_space *mapping, struct page *page)

</i></b></pre>

        <p>This does not seem to be the "natural" call to use, but

          anyway, I can live with that.<br>

          What really bothers me is that the behaviour of Lustre from

          this point looks exactly the same as if cache was enabled. I

          can't find a single branching point that handles things

          differently: pages are kmapped, written to/read from,

          kunmapped... I am probably missing something, but I can't

          figure out what. Could someone please point me in the right

          direction?</p>

        <p>The functions I find the most relevant to study are:<br>

          <font size="-1">"lustre/ofd/ofd_io.c":</font><br>

              ofd_preprw() -> ofd_preprw_read() / ofd_preprw_write()</p>

        <p>their counterparts:<br>

          <font size="-1">"lustre/ofd/ofd_io.c":</font><br>

              ofd_commitrw() -> ofd_commitrw_read() /

          ofd_commitrw_write()</p>

        <p>the handlers of the proc files

          "/proc/fs/lustre/obdfilter/*/{read,writethrough}_cache_enable":<br>

          <font size="-1">"lustre/osd-ldiskfs/osd_lproc.c":</font><br>

              ldiskfs_osd_cache_seq_write(),

          ldiskfs_osd_wcache_seq_write()</p>

        <p>and the only places that use the variables set by the proc

          files (where generic_error_remove_page() is used):<br>

          <font size="-1">"lustre/osd-ldiskfs/osd_io.c":</font><br>

              osd_read_prep(), osd_write_prep()</p>

        <p>(I suspect I am missing something really important about what

          generic_error_remove_page() does)</p>

        <p><br>

        </p>

        <p>Regards</p>

        <p>Quentin Bouget</p>

      </div>

    </blockquote>

    <p>Alright, so data has to go somewhere. That makes sense. And those

      pages are probably discarded as soon as the last reference is

      dropped (presumably on IO completion). So Lustre allocates cache

      pages, which it more or less treats as "regular" buffers and uses as

      such. So it does indeed mimic a "no_cache" policy... In that case, I need to

      elaborate...<br>

    </p>

    <p>I am running obdfilter on Lustre with SSD disks as OST(s). The

      performance of Lustre seems directly related to the number of

      threads I configure obdfilter to spawn (the more threads there

      are, the better).<br>

      But there is a catch: the more threads I spawn, the more they

      contend on locks inside the page cache allocation functions. I

      reach 100% CPU usage before reaching the theoretical throughput of the disks.</p>

    <p>So I wanted to see whether disabling the cache in Lustre gives a better

      ratio of CPU usage to IO throughput. I got confused

      when I noticed that there were still as many calls to

      find_or_create_page() as with the cache enabled, and that my CPU

      consumption was still maxed out. (I now understand why

      find_or_create_page() still gets called.)<br>

    </p>

    <p>Looking at the code of find_or_create_page(), it seems to do

      mainly two things: allocate a page, then add it to the LRU list.

      However, removing the page from the LRU list is quite a costly

      operation. I would have thought that generic_error_remove_page()

      would take care of it, efficiently, right after initialization,

      but perf_event shows me that it actually happens when the IO

      completes -- when the page is released -- and that half the time

      of one execution of obdfilter-survey is spent spinning there.</p>
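
    <p>To make that sequence concrete, here is a minimal toy model in
      Python. It is only a sketch and not kernel or Lustre code:
      ToyPage, ToyPageCache, remove_from_mapping() and uncached_io() are
      hypothetical names, and the reference counting is an assumption
      made to reproduce what perf_event reports, namely that the early
      removal only detaches the page from the mapping and that the LRU
      removal happens when the last reference is dropped.</p>

    <pre>
```python
class ToyPage:
    """Hypothetical stand-in for a page: an index and a reference count."""

    def __init__(self, index):
        self.index = index
        self.refcount = 0


class ToyPageCache:
    """Toy model of a mapping plus an LRU list (assumption, not kernel code)."""

    def __init__(self):
        self.mapping = {}        # index to page; lookups only consult this
        self.lru = []            # pages currently on the LRU list
        self.lru_removals = []   # log of (index, stage) for inspection

    def find_or_create_page(self, index):
        page = self.mapping.get(index)
        if page is None:
            page = ToyPage(index)
            page.refcount = 1            # reference held by the mapping
            self.mapping[index] = page
            self.lru.append(page)        # new pages go on the LRU right away
        page.refcount += 1               # reference handed to the caller
        return page

    def remove_from_mapping(self, page):
        """Models the early removal: only the mapping's reference is dropped."""
        if self.mapping.get(page.index) is page:
            del self.mapping[page.index]
            self.put_page(page, stage="remove")

    def put_page(self, page, stage="io_completion"):
        page.refcount -= 1
        if page.refcount == 0:
            # Only the last reference drop takes the page off the LRU.
            self.lru.remove(page)
            self.lru_removals.append((page.index, stage))


def uncached_io(cache, index):
    """One IO with caching disabled, as modelled by this sketch."""
    page = cache.find_or_create_page(index)
    cache.remove_from_mapping(page)      # page can no longer be found by lookup
    on_lru_during_io = page in cache.lru
    cache.put_page(page)                 # IO completes, last reference dropped
    return on_lru_during_io
```
</pre>

    <p>In this model, uncached_io(ToyPageCache(), 7) returns True: the
      page is still on the LRU list for the whole IO, and the only entry
      logged in lru_removals is tagged "io_completion". If the real code
      behaves like this, disabling the cache saves no LRU work at all.</p>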

    <p>Would it be conceivable for Lustre to use a different page allocation

      function when the cache is disabled? Maybe it is even possible to use

      buffers directly from the ptlrpc requests? Does anyone see

      another way out of this? </p>

    <p>I tried to be as clear as possible, but I can try again if need

      be. =)</p>

    <p><br>

    </p>

    <p>Regards,</p>

    <p>Quentin Bouget</p>

  </body>

</html>