<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>There were actually several:</p>
    <p>On an OSS:</p>
    <p>[447314.138709] BUG: unable to handle kernel NULL pointer
      dereference at 0000000000000020<br>
      [543262.189674] BUG: unable to handle kernel NULL pointer
      dereference at           (null)<br>
      [16397.115830] BUG: unable to handle kernel NULL pointer
      dereference at           (null)<br>
    </p>
    <p><br>
    </p>
    <p>On 2 separate clients:</p>
    <p>[65404.590906] BUG: unable to handle kernel NULL pointer
      dereference at           (null)<br>
      [72095.972732] BUG: unable to handle kernel paging request at
      0000002029b0e000<br>
      <br>
    </p>
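    <p>For pulling lines like these out of a saved console or
      <code>dmesg</code> log, here is a minimal Python sketch, not
      anything Lustre ships. The function name
      <code>find_crash_headers</code> is hypothetical, and the
      <code>LBUG</code> and <code>ASSERTION</code> patterns are
      assumptions about how those failures are logged; only the two
      <code>BUG:</code> forms are taken from the output above.</p>
    <pre>
```python
import re

# Crash signatures. The "BUG: unable to handle kernel" form appears in this
# thread; LBUG and ASSERTION are assumptions about how Lustre reports those.
CRASH_PATTERNS = re.compile(r"BUG: unable to handle kernel|LBUG|ASSERTION")


def find_crash_headers(log_text, context=2):
    """Return (line_number, line, following_lines) for each crash signature.

    line_number is 1-based; following_lines holds up to `context` lines
    that come right after the matching line.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if CRASH_PATTERNS.search(line):
            hits.append((i + 1, line.strip(), lines[i + 1 : i + 1 + context]))
    return hits


if __name__ == "__main__":
    # Two lines copied from the client log in this thread.
    sample = (
        "[72095.972732] BUG: unable to handle kernel paging request at "
        "0000002029b0e000\n"
        "[72095.973865] Call Trace:\n"
    )
    for number, header, following in find_crash_headers(sample):
        print(number, header)
        for extra in following:
            print("    " + extra)
```
    </pre>
    <p>Reading the whole log into a string and passing it to the
      function lists each crash header with the lines that follow it,
      which is usually enough to tell a NULL pointer dereference from a
      paging request fault.</p>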
    <p>Brian Andrus<br>
    </p>
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 8/4/2017 10:49 AM, Patrick Farrell
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:BN6PR1101MB21321D10C28F97910A7B1311CBB60@BN6PR1101MB2132.namprd11.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <meta name="Generator" content="Microsoft Exchange Server">
      <!-- converted from text -->
      <style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
      <meta content="text/html; charset=UTF-8">
      <style type="text/css" style="">
<!--
p
        {margin-top:0;
        margin-bottom:0}
-->
</style>
      <div dir="ltr">
        <div id="x_divtagdefaultwrapper" dir="ltr"
          style="font-size:12pt; color:#000000;
          font-family:Calibri,Helvetica,sans-serif">
          <p>Brian,</p>
          <br>
          <p>What is the actual crash?  Null pointer, failed
            assertion/LBUG...?  Probably just a few more lines back in
            the log would show that.</p>
          <p><br>
          </p>
          <p>Also, Lustre 2.10 has been released; you might benefit from
            switching to that.  There are almost certainly more bugs in
            the pre-2.10 development version you're running than in the
            release.
          </p>
          <p><br>
          </p>
          <p>- Patrick<br>
          </p>
        </div>
        <hr tabindex="-1" style="display:inline-block; width:98%">
        <div id="x_divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
            face="Calibri, sans-serif" color="#000000"><b>From:</b>
            lustre-discuss
            <a class="moz-txt-link-rfc2396E" href="mailto:lustre-discuss-bounces@lists.lustre.org">&lt;lustre-discuss-bounces@lists.lustre.org&gt;</a> on behalf of
            Brian Andrus <a class="moz-txt-link-rfc2396E" href="mailto:toomuchit@gmail.com">&lt;toomuchit@gmail.com&gt;</a><br>
            <b>Sent:</b> Friday, August 4, 2017 12:12:59 PM<br>
            <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a><br>
            <b>Subject:</b> [lustre-discuss] nodes crash during ior test</font>
          <div> </div>
        </div>
      </div>
      <font size="2"><span style="font-size:10pt;">
          <div class="PlainText">All,<br>
            <br>
            I am trying to run some ior benchmarking on a small system.<br>
            <br>
            It only has 2 OSSes.<br>
            I have been having some trouble where one of the clients
            will reboot and do a crash dump, seemingly at random. The
            runs work most of the time, but roughly one run in five a
            client reboots, and it is not always the same client.<br>
            <br>
            The call trace seems to point to lnet:<br>
            <br>
            <br>
            [72095.973865] Call Trace:<br>
            [72095.973892]  [&lt;ffffffffa070e856&gt;] ? cfs_percpt_unlock+0x36/0xc0 [libcfs]<br>
            [72095.973936]  [&lt;ffffffffa0779851&gt;] lnet_return_tx_credits_locked+0x211/0x480 [lnet]<br>
            [72095.973973]  [&lt;ffffffffa076c770&gt;] lnet_msg_decommit+0xd0/0x6c0 [lnet]<br>
            [72095.974006]  [&lt;ffffffffa076d0f9&gt;] lnet_finalize+0x1e9/0x690 [lnet]<br>
            [72095.974037]  [&lt;ffffffffa06baf45&gt;] ksocknal_tx_done+0x85/0x1c0 [ksocklnd]<br>
            [72095.974068]  [&lt;ffffffffa06c3277&gt;] ksocknal_handle_zcack+0x137/0x1e0 [ksocklnd]<br>
            [72095.974101]  [&lt;ffffffffa06becf1&gt;] ksocknal_process_receive+0x3a1/0xd90 [ksocklnd]<br>
            [72095.974134]  [&lt;ffffffffa06bfa6e&gt;] ksocknal_scheduler+0xee/0x670 [ksocklnd]<br>
            [72095.974165]  [&lt;ffffffff810b1b20&gt;] ? wake_up_atomic_t+0x30/0x30<br>
            [72095.974193]  [&lt;ffffffffa06bf980&gt;] ? ksocknal_recv+0x2a0/0x2a0 [ksocklnd]<br>
            [72095.974222]  [&lt;ffffffff810b0a4f&gt;] kthread+0xcf/0xe0<br>
            [72095.974244]  [&lt;ffffffff810b0980&gt;] ? kthread_create_on_node+0x140/0x140<br>
            [72095.974272]  [&lt;ffffffff81697758&gt;] ret_from_fork+0x58/0x90<br>
            [72095.974296]  [&lt;ffffffff810b0980&gt;] ? kthread_create_on_node+0x140/0x140<br>
            <br>
            I am currently using Lustre 2.9.59_15_g107b2cb built for
            kmod.<br>
            <br>
            Is there something I can do to track this down and hopefully
            remedy it?<br>
            <br>
            Brian Andrus<br>
            <br>
            _______________________________________________<br>
            lustre-discuss mailing list<br>
            <a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a><br>
            <a
              href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org"
              moz-do-not-send="true">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
          </div>
        </span></font>
    </blockquote>
    <br>
  </body>
</html>