The individual LUN looks good but the controller is showing amber, which is confusing us. However, other LUNs going through that controller are mounting fine.
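For reference, a minimal way to double-check from the OSS side that all paths to this one LUN are healthy (assuming it is still the /dev/mapper/mpathd multipath alias from the earlier messages) would be:

    multipath -ll mpathd
    dmesg | grep -i mpathd | tail

If every path reports active/ready there, the amber state on the controller may be unrelated to this particular target.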
-Paul Edmon-

On 7/20/2022 3:08 PM, Colin Faber wrote:
raid check?
        <div dir="ltr" class="gmail_attr">On Wed, Jul 20, 2022, 12:41 PM
          Paul Edmon <<a href="mailto:pedmon@cfa.harvard.edu"
            moz-do-not-send="true" class="moz-txt-link-freetext">pedmon@cfa.harvard.edu</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div>
[root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd /mnt/holylfs2-OST001f
mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
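Following the hint in that error, the kernel log from the same OSS should say why ldiskfs/ext4 refused the mount; a quick look (the grep pattern is only illustrative) would be:

    dmesg | tail -40
    dmesg | grep -iE 'ldiskfs|ext4|mpathd' | tail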
e2fsck did not look good:
[root@holylfs02oss06 ~]# less OST001f.out
ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
Clear journal? no

Block bitmap for group 8128 is not in group.  (block 3518518062363072290)
Relocate? no

Inode bitmap for group 8128 is not in group.  (block 12235298632209565410)
Relocate? no

Inode table for group 8128 is not in group.  (block 17751685088477790304)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate? no

Block bitmap for group 8129 is not in group.  (block 2193744380193356980)
Relocate? no

Inode bitmap for group 8129 is not in group.  (block 4102707059848926418)
Relocate? no

It continues at length like that.
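Since the primary group descriptors look bad, one possible next step (only a sketch, read-only, and assuming the default 4k ldiskfs block size; the backup superblock locations should be confirmed with dumpe2fs first) would be a check against a backup superblock:

    dumpe2fs /dev/mapper/mpathd | grep -i 'backup superblock'
    e2fsck -fn -b 32768 -B 4096 /dev/mapper/mpathd

With MMP enabled it will still sit through the multi-mount-protection wait before the pass starts.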
-Paul Edmon-

On 7/20/2022 2:31 PM, Colin Faber wrote:
Can you mount the target directly with -t ldiskfs?

Also what does e2fsck report?
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Wed, Jul 20, 2022,
                  11:48 AM Paul Edmon via lustre-discuss <<a
                    href="mailto:lustre-discuss@lists.lustre.org"
                    target="_blank" rel="noreferrer"
                    moz-do-not-send="true" class="moz-txt-link-freetext">lustre-discuss@lists.lustre.org</a>>
                  wrote:<br>
                </div>
We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of our OSTs had some disk failures, and after reconstruction of the RAID set it won't remount; it gives:

[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=<nid>

The weird thing is that we didn't build this with ZFS; the devices are all ldiskfs.  We suspect some of the data on the disk is corrupt, but we were wondering if anyone had seen this error before and whether there is a solution.
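A non-destructive way to see whether mount.lustre can still read the target's Lustre label and parameters (just a suggestion; --dryrun does not modify the device) might be:

    blkid /dev/mapper/mpathd
    tunefs.lustre --dryrun /dev/mapper/mpathd

If the label or mountdata is unreadable on a device this damaged, that could account for both the ZFS message and the missing mgsnode complaint; the dryrun output should confirm or rule that out.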
-Paul Edmon-