[lustre-discuss] 2.12.6 freeze
Alastair Basden
a.g.basden at durham.ac.uk
Wed Dec 1 01:09:06 PST 2021
Hi,
Turns out there is a problem with the zpool, which we think got corrupted
by a stonith event when a disk on another pool started reporting a
predicted failure.
A zpool scrub has been done, and there are 5 entries with permanent errors
(from zpool status -v):
errors: Permanent errors have been detected in the following files:
cos8-ost6/ost6:<0xe>
cos8-ost6/ost6:<0x1a>
cos8-ost6/ost6:<0x1c>
cos8-ost6/ost6:/
cos8-ost6/ost6:<0x193>
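(For anyone hitting the same thing: the raw `<0x..>` entries are dataset object
IDs, which can usually be inspected read-only with zdb. A minimal sketch, using
the pool/dataset names from the status output above; the heredoc just replays
that output so the extraction is self-contained, and the zdb invocation is the
standard dump-dnode form, untested against this particular pool.)

```shell
#!/bin/sh
# Pull the damaged object IDs out of the `zpool status -v` error listing.
# The heredoc replays the listing from this post; on a live system you
# would pipe `zpool status -v cos8-ost6` instead.
ids=$(grep -o '<0x[0-9a-f]*>' <<'EOF' | tr -d '<>'
cos8-ost6/ost6:<0xe>
cos8-ost6/ost6:<0x1a>
cos8-ost6/ost6:<0x1c>
cos8-ost6/ost6:/
cos8-ost6/ost6:<0x193>
EOF
)
echo "$ids"
# Each ID can then be inspected (read-only) with zdb, e.g.:
#   zdb -dddd cos8-ost6/ost6 0xe
# which dumps the dnode and, for plain files, the path if it is known.
```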
The fact that / itself is corrupted is particularly worrying!
If we set the canmount=on property and mount the zpool, then an ls of the
mount point gives an Input/output error.
Does anyone have experience with how to repair this?
There is no hardware problem - all 12 disks within this raidz2 pool are
fine - so we think the stonith must have caused it, though I thought ZFS
was supposed to be immune to that!
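(In case it helps others: the usual recovery path for permanent errors is to
restore or delete the affected files/objects, then `zpool clear` and re-scrub;
the error list only empties once the damaged data has been rewritten or
removed. A dry-run sketch, with the pool/dataset names taken from this thread;
DRY_RUN guards every state-changing command since none of this has been tested
against this pool.)

```shell
#!/bin/sh
# Hedged recovery sketch. DRY_RUN=1 prints each command instead of running it.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# 1. Mount the dataset read-only for inspection, avoiding further writes.
run zfs set canmount=on cos8-ost6/ost6
run zfs mount -o ro cos8-ost6/ost6

# 2. Restore or delete the affected files/objects (identified via zdb or
#    backups), then clear the recorded errors:
run zpool clear cos8-ost6

# 3. Re-scrub; entries drop off the permanent-error list once repaired.
run zpool scrub cos8-ost6
run zpool status -v cos8-ost6
```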
Thanks...
On Tue, 30 Nov 2021, Tommi Tervo wrote:
>
>> Upon attempting to mount a zfs OST, we are getting:
>> Message from syslogd at c8oss01 at Nov 29 18:11:47 ...
>> kernel:LustreError: 58223:0:(lu_object.c:1267:lu_device_fini())
>> ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
>>
>> Message from syslogd at c8oss01 at Nov 29 18:11:47 ...
>> kernel:LustreError: 58223:0:(lu_object.c:1267:lu_device_fini()) LBUG
>
> Hi,
>
> Looks like LU-12675; time to upgrade to 2.12.7?
>
> https://jira.whamcloud.com/browse/LU-12675
>
> HTH,
> -Tommi
>