[lustre-discuss] Unable to mount new OST

David Cohen cdavid at physics.technion.ac.il
Tue Jul 6 22:09:56 PDT 2021


Hi Jeff,
The logs are clear: the new OST is a brand new DDN pool, there are no alerts
on the physical storage, and there is no indication of malfunctioning disks
in the machine's logs.

After the reboot the dm device number changed (dm-21 became dm-30):
ls -la /dev/mapper/OST0051
lrwxrwxrwx 1 root root 8 Jul  6 07:59 /dev/mapper/OST0051 -> ../dm-30

ls /sys/block/dm-30/slaves
sdag  sdbm  sdcs  sddy

[root@oss03 ~]# grep sdag /var/log/messages
Jul  4 05:50:45 oss03 kernel: sd 12:0:0:92: [sdag] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 05:50:45 oss03 kernel: sd 12:0:0:92: [sdag] Write Protect is off
Jul  4 05:50:45 oss03 kernel: sd 12:0:0:92: [sdag] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 05:50:45 oss03 kernel: sd 12:0:0:92: [sdag] Attached SCSI disk
Jul  4 05:50:46 oss03 multipathd: sdag: add path (uevent)
Jul  4 05:50:46 oss03 multipathd: sdag [66:0]: path added to devmap OST0051
Jul  4 06:01:30 oss03 kernel: sd 10:0:0:92: [sdag] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:01:30 oss03 kernel: sd 10:0:0:92: [sdag] Write Protect is off
Jul  4 06:01:30 oss03 kernel: sd 10:0:0:92: [sdag] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:01:31 oss03 kernel: sd 10:0:0:92: [sdag] Attached SCSI disk
Jul  4 06:01:31 oss03 multipathd: sdag: add path (uevent)
Jul  4 06:01:31 oss03 multipathd: sdag [66:0]: path added to devmap OST0051
Jul  4 06:25:21 oss03 kernel: sd 12:0:0:92: [sdag] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:25:21 oss03 kernel: sd 12:0:0:92: [sdag] Write Protect is off
Jul  4 06:25:21 oss03 kernel: sd 12:0:0:92: [sdag] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:25:21 oss03 kernel: sd 12:0:0:92: [sdag] Attached SCSI disk
Jul  4 06:25:22 oss03 multipathd: sdag: add path (uevent)
Jul  4 06:25:22 oss03 multipathd: sdag [66:0]: path added to devmap OST0051
Jul  4 07:21:47 oss03 kernel: sd 10:0:0:92: [sdag] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 07:21:47 oss03 kernel: sd 10:0:0:92: [sdag] Write Protect is off
Jul  4 07:21:47 oss03 kernel: sd 10:0:0:92: [sdag] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 07:21:47 oss03 kernel: sd 10:0:0:92: [sdag] Attached SCSI disk
Jul  4 07:21:48 oss03 multipathd: sdag: add path (uevent)
Jul  4 07:21:48 oss03 multipathd: sdag [66:0]: path added to devmap OST0051
Jul  6 07:59:06 oss03 kernel: sd 10:0:0:92: [sdag] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  6 07:59:06 oss03 kernel: sd 10:0:0:92: [sdag] Write Protect is off
Jul  6 07:59:06 oss03 kernel: sd 10:0:0:92: [sdag] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  6 07:59:06 oss03 kernel: sd 10:0:0:92: [sdag] Attached SCSI disk
Jul  6 07:59:06 oss03 multipathd: sdag: add path (uevent)
Jul  6 07:59:06 oss03 multipathd: sdag [66:0]: path added to devmap OST0051
[root@oss03 ~]# grep sdbm /var/log/messages
Jul  4 05:50:49 oss03 kernel: sd 13:0:0:92: [sdbm] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 05:50:49 oss03 kernel: sd 13:0:0:92: [sdbm] Write Protect is off
Jul  4 05:50:49 oss03 kernel: sd 13:0:0:92: [sdbm] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 05:50:49 oss03 kernel: sd 13:0:0:92: [sdbm] Attached SCSI disk
Jul  4 05:50:49 oss03 multipathd: sdbm: add path (uevent)
Jul  4 05:50:49 oss03 multipathd: sdbm [68:0]: path added to devmap OST0051
Jul  4 06:01:34 oss03 kernel: sd 11:0:0:92: [sdbm] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:01:34 oss03 kernel: sd 11:0:0:92: [sdbm] Write Protect is off
Jul  4 06:01:34 oss03 kernel: sd 11:0:0:92: [sdbm] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:01:34 oss03 kernel: sd 11:0:0:92: [sdbm] Attached SCSI disk
Jul  4 06:01:34 oss03 multipathd: sdbm: add path (uevent)
Jul  4 06:01:34 oss03 multipathd: sdbm [68:0]: path added to devmap OST0051
Jul  4 06:25:25 oss03 kernel: sd 13:0:0:92: [sdbm] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:25:25 oss03 kernel: sd 13:0:0:92: [sdbm] Write Protect is off
Jul  4 06:25:25 oss03 kernel: sd 13:0:0:92: [sdbm] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:25:25 oss03 kernel: sd 13:0:0:92: [sdbm] Attached SCSI disk
Jul  4 06:25:25 oss03 multipathd: sdbm: add path (uevent)
Jul  4 06:25:25 oss03 multipathd: sdbm [68:0]: path added to devmap OST0051
Jul  4 07:21:50 oss03 kernel: sd 11:0:0:92: [sdbm] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 07:21:50 oss03 kernel: sd 11:0:0:92: [sdbm] Write Protect is off
Jul  4 07:21:50 oss03 kernel: sd 11:0:0:92: [sdbm] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 07:21:50 oss03 kernel: sd 11:0:0:92: [sdbm] Attached SCSI disk
Jul  4 07:21:50 oss03 multipathd: sdbm: add path (uevent)
Jul  4 07:21:50 oss03 multipathd: sdbm [68:0]: path added to devmap OST0051
Jul  6 07:59:09 oss03 kernel: sd 11:0:0:92: [sdbm] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  6 07:59:09 oss03 kernel: sd 11:0:0:92: [sdbm] Write Protect is off
Jul  6 07:59:09 oss03 kernel: sd 11:0:0:92: [sdbm] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  6 07:59:09 oss03 kernel: sd 11:0:0:92: [sdbm] Attached SCSI disk
Jul  6 07:59:09 oss03 multipathd: sdbm: add path (uevent)
Jul  6 07:59:09 oss03 multipathd: sdbm [68:0]: path added to devmap OST0051
[root@oss03 ~]# grep sdcs /var/log/messages
Jul  4 05:51:16 oss03 kernel: sd 14:0:0:92: [sdcs] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 05:51:16 oss03 kernel: sd 14:0:0:92: [sdcs] Write Protect is off
Jul  4 05:51:16 oss03 kernel: sd 14:0:0:92: [sdcs] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 05:51:16 oss03 kernel: sd 14:0:0:92: [sdcs] Attached SCSI disk
Jul  4 05:51:16 oss03 multipathd: sdcs: add path (uevent)
Jul  4 05:51:16 oss03 multipathd: sdcs [70:0]: path added to devmap OST0051
Jul  4 06:02:04 oss03 kernel: sd 14:0:0:92: [sdcs] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:02:04 oss03 kernel: sd 14:0:0:92: [sdcs] Write Protect is off
Jul  4 06:02:04 oss03 kernel: sd 14:0:0:92: [sdcs] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:02:04 oss03 kernel: sd 14:0:0:92: [sdcs] Attached SCSI disk
Jul  4 06:02:04 oss03 multipathd: sdcs: add path (uevent)
Jul  4 06:02:04 oss03 multipathd: sdcs [70:0]: path added to devmap OST0051
Jul  4 06:25:52 oss03 kernel: sd 14:0:0:92: [sdcs] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:25:52 oss03 kernel: sd 14:0:0:92: [sdcs] Write Protect is off
Jul  4 06:25:52 oss03 kernel: sd 14:0:0:92: [sdcs] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:25:52 oss03 kernel: sd 14:0:0:92: [sdcs] Attached SCSI disk
Jul  4 06:25:52 oss03 multipathd: sdcs: add path (uevent)
Jul  4 06:25:52 oss03 multipathd: sdcs [70:0]: path added to devmap OST0051
Jul  4 07:22:20 oss03 kernel: sd 14:0:0:92: [sdcs] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 07:22:20 oss03 kernel: sd 14:0:0:92: [sdcs] Write Protect is off
Jul  4 07:22:20 oss03 kernel: sd 14:0:0:92: [sdcs] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 07:22:20 oss03 kernel: sd 14:0:0:92: [sdcs] Attached SCSI disk
Jul  4 07:22:20 oss03 multipathd: sdcs: add path (uevent)
Jul  4 07:22:20 oss03 multipathd: sdcs [70:0]: path added to devmap OST0051
Jul  6 07:59:39 oss03 kernel: sd 14:0:0:92: [sdcs] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  6 07:59:39 oss03 kernel: sd 14:0:0:92: [sdcs] Write Protect is off
Jul  6 07:59:39 oss03 kernel: sd 14:0:0:92: [sdcs] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  6 07:59:39 oss03 kernel: sd 14:0:0:92: [sdcs] Attached SCSI disk
Jul  6 07:59:39 oss03 multipathd: sdcs: add path (uevent)
Jul  6 07:59:39 oss03 multipathd: sdcs [70:0]: path added to devmap OST0051
[root@oss03 ~]# grep sddy /var/log/messages
Jul  4 05:51:18 oss03 kernel: sd 15:0:0:92: [sddy] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 05:51:18 oss03 kernel: sd 15:0:0:92: [sddy] Write Protect is off
Jul  4 05:51:18 oss03 kernel: sd 15:0:0:92: [sddy] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 05:51:18 oss03 kernel: sd 15:0:0:92: [sddy] Attached SCSI disk
Jul  4 05:51:18 oss03 multipathd: sddy: add path (uevent)
Jul  4 05:51:18 oss03 multipathd: sddy [128:0]: path added to devmap OST0051
Jul  4 06:02:07 oss03 kernel: sd 15:0:0:92: [sddy] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:02:07 oss03 kernel: sd 15:0:0:92: [sddy] Write Protect is off
Jul  4 06:02:07 oss03 kernel: sd 15:0:0:92: [sddy] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:02:07 oss03 kernel: sd 15:0:0:92: [sddy] Attached SCSI disk
Jul  4 06:02:07 oss03 multipathd: sddy: add path (uevent)
Jul  4 06:02:07 oss03 multipathd: sddy [128:0]: path added to devmap OST0051
Jul  4 06:25:54 oss03 kernel: sd 15:0:0:92: [sddy] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 06:25:54 oss03 kernel: sd 15:0:0:92: [sddy] Write Protect is off
Jul  4 06:25:54 oss03 kernel: sd 15:0:0:92: [sddy] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 06:25:54 oss03 kernel: sd 15:0:0:92: [sddy] Attached SCSI disk
Jul  4 06:25:54 oss03 multipathd: sddy: add path (uevent)
Jul  4 06:25:54 oss03 multipathd: sddy [128:0]: path added to devmap OST0051
Jul  4 07:22:23 oss03 kernel: sd 15:0:0:92: [sddy] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  4 07:22:23 oss03 kernel: sd 15:0:0:92: [sddy] Write Protect is off
Jul  4 07:22:23 oss03 kernel: sd 15:0:0:92: [sddy] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  4 07:22:23 oss03 kernel: sd 15:0:0:92: [sddy] Attached SCSI disk
Jul  4 07:22:23 oss03 multipathd: sddy: add path (uevent)
Jul  4 07:22:23 oss03 multipathd: sddy [128:0]: path added to devmap OST0051
Jul  6 07:59:41 oss03 kernel: sd 15:0:0:92: [sddy] 34863054848 4096-byte
logical blocks: (142 TB/129 TiB)
Jul  6 07:59:41 oss03 kernel: sd 15:0:0:92: [sddy] Write Protect is off
Jul  6 07:59:41 oss03 kernel: sd 15:0:0:92: [sddy] Write cache: enabled,
read cache: enabled, supports DPO and FUA
Jul  6 07:59:41 oss03 kernel: sd 15:0:0:92: [sddy] Attached SCSI disk
Jul  6 07:59:42 oss03 multipathd: sddy: add path (uevent)
Jul  6 07:59:42 oss03 multipathd: sddy [128:0]: path added to devmap OST0051
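
The four greps above could be folded into one loop. What follows is only a
sketch under stated assumptions: the function name scan_dm_slaves, the
optional sysroot argument and the keyword list are made up for illustration,
and the keyword list is certainly not exhaustive. It walks
/sys/block/<dm>/slaves and prints only the log lines that mention one of
those paths together with an error-like word.

```shell
# Hypothetical helper: list the paths under a device-mapper node and print
# the log lines that name one of them AND contain an error-like keyword.
# sysroot defaults to /sys; it is a parameter only so the function can be
# tried against a scratch directory.
scan_dm_slaves() {
    dm=$1
    logfile=$2
    sysroot=${3:-/sys}
    for path in "$sysroot/block/$dm/slaves"/*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$path" ] || continue
        dev=${path##*/}
        # -w matches the device name as a whole word, e.g. "[sdag]" or "sdag:".
        # The keyword list is an assumption, not a complete set of failure strings.
        grep -w -- "$dev" "$logfile" \
            | grep -Ei 'error|fail|timeout|reset|offline' || true
    done
    return 0
}
```

For the device shown at the top of this message that would be
"scan_dm_slaves dm-30 /var/log/messages". No output means no line matched the
keyword list, which is not the same as proof that the paths are healthy.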

On Wed, Jul 7, 2021 at 7:24 AM Jeff Johnson <jeff.johnson at aeoncomputing.com>
wrote:

> What devices are underneath dm-21 and are there any errors in
> /var/log/messages for those devices? (assuming /dev/sdX devices underneath)
>
> Run `ls /sys/block/dm-21/slaves` to see what devices are beneath dm-21
>
>
>
>
>
> On Tue, Jul 6, 2021 at 20:09 David Cohen <cdavid at physics.technion.ac.il>
> wrote:
>
>> Hi,
>> The index of the OST is unique in the system and free for the new one, as
>> it is increased by "1" for every new OST created, so whatever it converts
>> to should not be relevant to its refusal to mount, or am I mistaken?
>>
>> I'm pasting the log messages again, in case they were lost up the thread,
>> and adding the output of "fdisk -l", in case the OST size is the issue:
>>
>> lctl dk shows tens of thousands of lines repeating the same error after
>> attempting to mount the OST:
>>
>> 00100000:10000000:26.0:1625546374.322973:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
>> 00100000:10000000:26.0:1625546374.322974:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
>> 00100000:10000000:26.0:1625546374.322975:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
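
Since the same record repeats tens of thousands of times, it may help to
collapse the lctl dk output before reading it. A minimal sketch, assuming
each record sits on one line (the mail client wrapped them here) and that
everything before the first "(" is the per-record header; summarize_dk is a
hypothetical name. On Linux errno 30 is EROFS (read-only file system), so
rc = -30 is consistent with the kernel remounting dm-21 read-only in the
/var/log/messages excerpt quoted in this thread.

```shell
# Collapse repeated debug-log records into "count message" pairs.
# Assumption: the colon-separated fields before the first "(" (masks, CPU,
# timestamp, PID) are the only part that differs between repeats.
summarize_dk() {
    awk '{
        sub(/^[^(]*/, "")      # drop the per-record header fields
        count[$0]++            # tally what is left: (file:line:func()) message
    }
    END {
        for (msg in count) printf "%d %s\n", count[msg], msg
    }' "$@"
}
```

Something like "lctl dk | summarize_dk" would then print one line per
distinct message with its count, assuming lctl dk writes to standard output
when no file name is given.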
>>
>> In /var/log/messages I see the following for dm-21, which is
>> the new OST:
>>
>> Jul  6 07:38:37 oss03 kernel: LDISKFS-fs warning (device dm-21):
>> ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected,
>> please wait.
>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): file extents enabled,
>> maximum tree depth=5
>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21):
>> ldiskfs_clear_journal_err:4862: Filesystem error recorded from previous
>> mount: IO failure
>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21):
>> ldiskfs_clear_journal_err:4863: Marking fs in need of filesystem check.
>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): warning: mounting fs
>> with errors, running e2fsck is recommended
>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): recovery complete
>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): mounted filesystem with
>> ordered data mode. Opts:
>> user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs error (device dm-21):
>> htree_dirblock_to_tree:1278: inode #2: block 21233: comm mount.lustre: bad
>> entry in directory: rec_len is too small for name_len - offset=4084(4084),
>> inode=0, rec_len=12, name_len=0
>> Jul  6 07:39:22 oss03 kernel: Aborting journal on device dm-21-8.
>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): Remounting filesystem
>> read-only
>> Jul  6 07:39:24 oss03 kernel: LDISKFS-fs warning (device dm-21):
>> kmmpd:187: kmmpd being stopped since filesystem has been remounted as
>> readonly.
>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): error count since last
>> fsck: 6
>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): initial error at time
>> 1625367384: htree_dirblock_to_tree:1278: inode 2: block 21233
>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): last error at time
>> 1625546362: htree_dirblock_to_tree:1278: inode 2: block 21233
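
The "initial error" and "last error" times above are seconds since the Unix
epoch. A small sketch for turning them into readable timestamps, assuming
GNU date (the "-d @SECONDS" form is not available in every date
implementation); epoch_to_utc is a hypothetical name.

```shell
# Convert the epoch seconds printed by LDISKFS into a UTC timestamp.
# Assumes GNU date (coreutils), which accepts "-d @SECONDS".
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1625367384   # initial error: 2021-07-04 02:56:24
epoch_to_utc 1625546362   # last error:    2021-07-06 04:39:22
```

If the server clock is three hours ahead of UTC (an assumption), the last
error falls on the same second as the 07:39:22 htree_dirblock_to_tree message,
and the initial error falls on the morning of Jul 4, a few minutes after the
first "path added to devmap OST0051" lines.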
>>
>> fdisk -l /dev/mapper/OST0051
>>
>> Disk /dev/mapper/OST0051: 142799.1 GB, 142799072657408 bytes, 34863054848
>> sectors
>> Units = sectors of 1 * 4096 = 4096 bytes
>> Sector size (logical/physical): 4096 bytes / 4096 bytes
>> I/O size (minimum/optimal): 2097152 bytes / 2097152 bytes
>>
>>
>> Thanks,
>> David
>>
>> On Tue, Jul 6, 2021 at 10:35 PM Spitz, Cory James <cory.spitz at hpe.com>
>> wrote:
>>
>>> What OST index (number) were you trying to add?
>>>
>>>
>>>
>>> Andreas is right:
>>>
>>> Note that your "--index=0051" value is probably interpreted as an octal
>>> number "41", it should be "--index=0x0051" or "--index=0x51" (hex, to match
>>> the OST device name) or "--index=81" (decimal).
>>>
>>>
>>>
>>> And you said:
>>>
>>> I'm aware that index 51 actually translates to hex 33
>>> (local-OST0033_UUID).
>>>
>>>
>>>
>>> Ok, 0051 (in octal by way of the leading zeros*) translates to decimal
>>> 41 as Andreas pointed out, but that’s 0x29 in hexadecimal, not 0x33.
>>> Assuming you wanted to use decimal 51 then you’d have tried to mkfs.lustre
>>> the wrong index.  So, if you wanted to use decimal 51, you’d have to use
>>> --index=0x33 or --index=0063.
>>>
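
The readings of the index discussed above can be checked with shell
arithmetic, which follows the same C prefix rules (0x for hex, a leading 0
for octal). This only illustrates the number bases; whether a given
mkfs.lustre release parses --index this way is not something the sketch
shows. ost_name_for_index is a hypothetical helper, and the OST%04x naming is
inferred from the names in this thread (local-OST0033 for decimal 51).

```shell
# Hypothetical helper: read an index string with C-style prefix rules and
# print the decimal value plus the four-hex-digit OST name it would map to.
# Note: the argument is evaluated as an arithmetic expression, so this is
# for trusted, hand-typed input only.
ost_name_for_index() {
    value=$(( $1 ))     # 0051 -> 41 (octal), 51 -> 51, 0x51 -> 81 (hex)
    printf 'index=%s decimal=%d name=OST%04x\n' "$1" "$value" "$value"
}

for idx in 0051 51 0x51 0063; do
    ost_name_for_index "$idx"
done
```

For reference, the log lines in this thread name the target local-OST0033,
and 0x33 is decimal 51.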
>>>
>>>
>>> -Cory
>>>
>>>
>>>
>>> p.s.
>>>
>>> (*) BTW, the convention with leading zeros for octal can be googled or
>>> read about at https://en.wikipedia.org/wiki/Octal.
>>>
>>>
>>>
>>>
>>>
>>> On 7/6/21, 12:35 AM, "lustre-discuss on behalf of David Cohen" <
>>> lustre-discuss-bounces at lists.lustre.org on behalf of
>>> cdavid at physics.technion.ac.il> wrote:
>>>
>>>
>>>
>>> Thanks Andreas,
>>>
>>> I'm aware that index 51 actually translates to hex 33
>>> (local-OST0033_UUID).
>>> I don't believe that's the reason for the failed mount as it is only an
>>> index that I increase for every new OST and there are no duplicates.
>>>
>>>
>>>
>>> lctl dk shows tens of thousands of lines repeating the same error after
>>> attempting to mount the OST:
>>>
>>>
>>>
>>> 00100000:10000000:26.0:1625546374.322973:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
>>>
>>> 00100000:10000000:26.0:1625546374.322974:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
>>>
>>> 00100000:10000000:26.0:1625546374.322975:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
>>> local-OST0033: fail to set LMA for init OI scrub: rc = -30
>>>
>>>
>>>
>>> In /var/log/messages I see the following for dm-21, which is
>>> the new OST:
>>>
>>>
>>>
>>> Jul  6 07:38:37 oss03 kernel: LDISKFS-fs warning (device dm-21):
>>> ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected,
>>> please wait.
>>>
>>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): file extents enabled,
>>> maximum tree depth=5
>>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21):
>>> ldiskfs_clear_journal_err:4862: Filesystem error recorded from previous
>>> mount: IO failure
>>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21):
>>> ldiskfs_clear_journal_err:4863: Marking fs in need of filesystem check.
>>> Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): warning: mounting fs
>>> with errors, running e2fsck is recommended
>>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): recovery complete
>>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): mounted filesystem
>>> with ordered data mode. Opts:
>>> user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
>>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs error (device dm-21):
>>> htree_dirblock_to_tree:1278: inode #2: block 21233: comm mount.lustre: bad
>>> entry in directory: rec_len is too small for name_len - offset=4084(4084),
>>> inode=0, rec_len=12, name_len=0
>>> Jul  6 07:39:22 oss03 kernel: Aborting journal on device dm-21-8.
>>> Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): Remounting filesystem
>>> read-only
>>> Jul  6 07:39:24 oss03 kernel: LDISKFS-fs warning (device dm-21):
>>> kmmpd:187: kmmpd being stopped since filesystem has been remounted as
>>> readonly.
>>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): error count since last
>>> fsck: 6
>>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): initial error at time
>>> 1625367384: htree_dirblock_to_tree:1278: inode 2: block 21233
>>> Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): last error at time
>>> 1625546362: htree_dirblock_to_tree:1278: inode 2: block 21233
>>>
>>> As I mentioned before, the mount never completes, so the only way out
>>> is a forced reboot.
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>
>>> On Tue, Jul 6, 2021 at 8:07 AM Andreas Dilger <adilger at whamcloud.com>
>>> wrote:
>>>
>>>
>>>
>>>
>>>
>>> On Jul 5, 2021, at 09:05, David Cohen <cdavid at physics.technion.ac.il>
>>> wrote:
>>>
>>>
>>>
>>> Hi,
>>>
>>> I'm using Lustre 2.10.5 and recently tried to add a new OST.
>>>
>>> The OST was formatted with the command below, which other than the index
>>> is the exact same one used for all the other OSTs in the system.
>>>
>>>
>>>
>>> mkfs.lustre --reformat --mkfsoptions="-t ext4 -T huge" --ost
>>> --fsname=local  --index=0051 --param ost.quota_type=ug
>>> --mountfsoptions='errors=remount-ro,extents,mballoc' --mgsnode=10.0.0.3@tcp
>>> --mgsnode=10.0.0.1@tcp --mgsnode=10.0.0.2@tcp --servicenode=10.0.0.3@tcp
>>> --servicenode=10.0.0.1@tcp --servicenode=10.0.0.2@tcp
>>> /dev/mapper/OST0051
>>>
>>>
>>>
>>> Note that your "--index=0051" value is probably interpreted as an octal
>>> number "41", it should be "--index=0x0051" or "--index=0x51" (hex, to match
>>> the OST device name) or "--index=81" (decimal).
>>>
>>>
>>>
>>>
>>>
>>> When trying to mount the OST with:
>>> mount.lustre /dev/mapper/OST0051 /Lustre/OST0051
>>>
>>>
>>>
>>> The system stays on 100% CPU (one core) forever and the mount never
>>> completes, not even after a week.
>>>
>>>
>>> I tried tunefs.lustre --writeconf --erase-params on the MDS and all the
>>> other targets, but the behaviour remains the same.
>>>
>>>
>>>
>>> Cheers, Andreas
>>>
>>> --
>>>
>>> Andreas Dilger
>>>
>>> Lustre Principal Architect
>>>
>>> Whamcloud
>>>
>>>
>>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.johnson at aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001   f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage