[lustre-discuss] ZFS wobble
Alastair Basden
a.g.basden at durham.ac.uk
Thu Apr 28 01:10:31 PDT 2022
Hi,
We have OSDs on ZFS (0.7.9) / Lustre 2.12.6.
Recently, one of our JBODs had a wobble, and the disks (as presented to
the OS) disappeared for a few seconds (and then returned).
This upset a few zpools, which went into the SUSPENDED state.
A zpool clear on these then started the resilvering process, and zpool
status gave e.g.:
errors: Permanent errors have been detected in the following files:
        <metadata>:<0x0>
        <metadata>:<0xb01>
        <metadata>:<0x15>
        <metadata>:<0x383>
        cos6-ost7/ost7:/O/400000400/d11/10617643
        cos6-ost7/ost7:/O/400000400/d21/583029
However, once the resilvering had completed, these permanent errors had
gone.
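
For anyone wanting to compare the error list before and after a resilver, here is a minimal sketch in Python. It assumes the zpool status text has been saved to a string and that the listing has the format shown above. The function names (parse_permanent_errors, cleared_entries) are hypothetical helpers for illustration, not part of ZFS or Lustre, and the script does not call zpool itself.

```python
import re

# Matches object-ID entries such as "<metadata>:<0x15>".
# Assumption: such entries always end in ":<0x...>" as in the listing above.
OBJECT_RE = re.compile(r"^(?P<dataset>.+):<(?P<obj>0x[0-9a-fA-F]+)>$")


def parse_permanent_errors(status_text):
    """Return (objects, files) parsed from saved zpool status text.

    objects is a list of (dataset, object_id) tuples for entries reported
    by object ID; files is a list of entries reported by path.
    """
    objects, files = [], []
    in_list = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("errors: Permanent errors"):
            in_list = True  # entries follow this header line
            continue
        if not in_list or not line:
            continue
        match = OBJECT_RE.match(line)
        if match:
            objects.append((match.group("dataset"), int(match.group("obj"), 16)))
        else:
            files.append(line)
    return objects, files


def cleared_entries(before_text, after_text):
    """Return the entries listed before the resilver but not after it."""
    before_objs, before_files = parse_permanent_errors(before_text)
    after_objs, after_files = parse_permanent_errors(after_text)
    gone_objs = [o for o in before_objs if o not in after_objs]
    gone_files = [f for f in before_files if f not in after_files]
    return gone_objs, gone_files


# Sample input copied from the listing above.
sample = """errors: Permanent errors have been detected in the following files:
<metadata>:<0x0>
<metadata>:<0xb01>
<metadata>:<0x15>
<metadata>:<0x383>
cos6-ost7/ost7:/O/400000400/d11/10617643
cos6-ost7/ost7:/O/400000400/d21/583029
"""

objects, files = parse_permanent_errors(sample)
```

Comparing the saved output against a later one with cleared_entries only shows which entries dropped out of the list; it says nothing about whether the underlying data was actually repaired.
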
The question, then, is whether these errors are really permanent, or
whether ZFS was able to correct them.
Lustre continues to run fine (though it obviously froze while the pools
were suspended).
Should we be worried that there might be some under-the-hood corruption
that will present itself when we need to remount the OST (e.g. after a
reboot)? In particular, the <metadata>:<0x0> entry worries me a bit!
Thanks,
Alastair.