<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Hello Andreas,</p>
    <p>Thank you for your prompt response. I had also started to
      suspect a hardware issue. I will experiment with the DIMMs
      and will be sure to get back to you if the issue is resolved.</p>
    <p>Cheers, Julien.<br>
    </p>
    <div class="moz-cite-prefix">On 30/10/2021 at 02:46, Andreas Dilger
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:1A43B64B-7498-44B3-AB1D-69FFAA195F75@ddn.com">
      On Oct 29, 2021, at 07:39, Julien Rey via lustre-discuss &lt;<a
        href="mailto:lustre-discuss@lists.lustre.org" class=""
        moz-do-not-send="true">lustre-discuss@lists.lustre.org</a>&gt;
      wrote:<br class="">
      <div>
        <blockquote type="cite" class="">
          <div class="">
            <div class="">Hello,<br class="">
              <br class="">
              This may not be directly related to Lustre, but here's
              what I get when I try to mount our Lustre filesystem on
              one of our compute nodes running CentOS 7:<br class="">
              <br class="">
              <br class="">
              Oct 29 14:30:20 gpu-node8 kernel: SLUB: Unable to allocate
              memory on node -1 (gfp=0x8050)<br class="">
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
        There doesn't appear to be anything "wrong" here: -1 means "no
        specific node", and the GFP mask (0x8050) is __GFP_ZERO |
        __GFP_IO | __GFP_WAIT for this kernel.</div>
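      <div class="">As a rough illustration of how that mask breaks
        down, here is a minimal sketch. The bit values are an assumption
        taken from include/linux/gfp.h in 3.10-era kernels, the table is
        only a subset, and the names decode_gfp and GFP_FLAGS_3_10 are
        hypothetical helpers, not part of any kernel or Lustre tool.</div>
      <pre>
```python
# Subset of GFP bits, assumed from include/linux/gfp.h in 3.10-era kernels
# (such as the CentOS 7 kernel in this thread). Other kernels differ.
GFP_FLAGS_3_10 = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x8000: "__GFP_ZERO",
}


def decode_gfp(mask):
    """Return (names of known bits set, highest first; leftover unknown bits)."""
    names = []
    leftover = mask
    for bit, name in sorted(GFP_FLAGS_3_10.items(), reverse=True):
        if mask & bit:
            names.append(name)
            leftover -= bit  # each table entry is a distinct single bit
    return names, leftover


names, leftover = decode_gfp(0x8050)
print(" | ".join(names), "leftover:", hex(leftover))
```
      </pre>
      <div class="">A non-zero leftover would mean the mask carries
        bits that this partial table does not name.</div>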
      <div><br class="">
      </div>
      <div>The one time I saw problems like this, all the DIMMs were
        installed on one socket of a dual-socket NUMA motherboard. No
        memory was available on the other socket, yet only some
        allocations failed.</div>
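      <div class="">One way to check for that layout is to look at the
        per-node memory totals. The sketch below is only an
        illustration: it assumes the usual Linux sysfs line format
        ("Node 0 MemTotal: ... kB" in
        /sys/devices/system/node/node*/meminfo), the sample text is made
        up, and nodes_without_memory and read_node_meminfo are
        hypothetical helper names.</div>
      <pre>
```python
import glob
import re

# Line format assumed from /sys/devices/system/node/node*/meminfo on Linux,
# e.g. "Node 0 MemTotal:       16318892 kB".
MEMTOTAL_RE = re.compile(r"^Node\s+(\d+)\s+MemTotal:\s+(\d+)\s+kB", re.MULTILINE)


def nodes_without_memory(meminfo_text):
    """Return the sorted NUMA node ids whose MemTotal is reported as 0 kB."""
    totals = {int(node): int(kb) for node, kb in MEMTOTAL_RE.findall(meminfo_text)}
    return sorted(node for node, kb in totals.items() if kb == 0)


def read_node_meminfo(pattern="/sys/devices/system/node/node*/meminfo"):
    """Concatenate the per-node meminfo files (empty string if none exist)."""
    chunks = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            chunks.append(handle.read())
    return "".join(chunks)


# Made-up sample: every DIMM sits on socket 0, so node 1 reports no memory.
sample = (
    "Node 0 MemTotal:       32768000 kB\n"
    "Node 1 MemTotal:              0 kB\n"
)
print(nodes_without_memory(sample))
```
      </pre>
      <div class="">On the affected machine, passing
        read_node_meminfo() instead of the sample would show whether any
        node is empty.</div>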
      <div><br class="">
      </div>
      <div>Cheers, Andreas</div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div class="">Oct 29 14:30:20 gpu-node8 kernel:  cache:
              dm_rq_target_io, object size: 136, buffer size: 136,
              default order: 0, min order: 0<br class="">
              Oct 29 14:30:20 gpu-node8 kernel:  node 1: slabs: 2, objs:
              60, free: 0<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3097:0:(niobuf.c:994:ptlrpc_register_rqbd()) LNetMDAttach
              failed: -12;<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3097:0:(service.c:2551:ptlrpc_main()) Failed to post rqbd
              for ldlm_cbd on CPT 0: -1<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(service.c:2917:ptlrpc_start_threads()) cannot
              start ldlm_cb thread #0_0: rc -1<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(service.c:837:ptlrpc_register_service()) Failed to
              start threads for service ldlm_cbd: -1<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(ldlm_lockd.c:3077:ldlm_setup()) failed to start
              service<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(ldlm_lib.c:462:client_obd_setup()) ldlm_get_ref
              failed: -1<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(obd_config.c:559:class_setup()) setup
              MGC10.0.1.70@tcp failed (-1)<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(obd_mount.c:202:lustre_start_simple())
              MGC10.0.1.70@tcp setup error -1<br class="">
              Oct 29 14:30:20 gpu-node8 kernel: LustreError:
              3091:0:(obd_mount.c:1608:lustre_fill_super()) Unable to
              mount (-1)<br class="">
              <br class="">
              <br class="">
              I've been scratching my head over this one: it could
              just be a kernel bug, but we have 3 other identical
              servers running the exact same versions of CentOS 7 and
              the Lustre client, and I have no problems with them.<br class="">
              <br class="">
              Some more info:<br class="">
              <br class="">
              [root@gpu-node8 ~]# uname -r<br class="">
              3.10.0-1160.el7.x86_64<br class="">
              <br class="">
              [root@gpu-node8 ~]# lctl --version<br class="">
              lctl 2.12.7<br class="">
              <br class="">
              [root@gpu-node8 ~]# vmstat -m |grep dm_rq_target_io<br
                class="">
              dm_rq_target_io              60     60    136     30<br
                class="">
              <br class="">
              [root@gpu-node8 ~]# free -h<br class="">
                            total        used        free      shared
              buff/cache   available<br class="">
              Mem:            31G        1.4G         29G         10M
              117M         29G<br class="">
              Swap:           15G          0B         15G<br class="">
              <br class="">
              <br class="">
              I've been playing with the sysctl parameters, but I don't
              really know what I'm doing and got no result anyway:<br
                class="">
              <br class="">
              sysctl vm.overcommit_memory=1<br class="">
              <br class="">
              sysctl vm.min_free_kbytes=90112<br class="">
              <br class="">
              sysctl vm.overcommit_kbytes=90112<br class="">
              <br class="">
              <br class="">
              Any help would be greatly appreciated.<br class="">
              <br class="">
              Thanks!<br class="">
              <br class="">
              -- <br class="">
              Julien REY<br class="">
              <br class="">
              Plate-forme RPBS<br class="">
              Modélisation Computationnelle des Interactions
              Protéines-Ligand (CMPLI)<br class="">
              Université de Paris<br class="">
              tel : 01 57 27 83 95<br class="">
              <br class="">
            </div>
          </div>
        </blockquote>
      </div>
      <br class="">
      <div class="">
        <div>Cheers, Andreas</div>
        <div>--</div>
        <div>Andreas Dilger</div>
        <div>Lustre Principal Architect</div>
        <div>Whamcloud</div>
      </div>
      <br class="">
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Julien REY

Plate-forme RPBS
Modélisation Computationnelle des Interactions Protéines-Ligand (CMPLI)
Université de Paris
tel : 01 57 27 83 95</pre>
  </body>
</html>