[lustre-discuss] ZFS wobble

Alastair Basden a.g.basden at durham.ac.uk
Thu Apr 28 01:10:31 PDT 2022


Hi,

We have OSDs on ZFS (0.7.9) / Lustre 2.12.6.

Recently, one of our JBODs had a wobble, and the disks (as presented to 
the OS) disappeared for a few seconds (and then returned).

This upset a few zpools, which went into the SUSPENDED state.

Running zpool clear on these pools then started the resilvering process, 
and zpool status reported, e.g.:
errors: Permanent errors have been detected in the following files:

         <metadata>:<0x0>
         <metadata>:<0xb01>
         <metadata>:<0x15>
         <metadata>:<0x383>
         cos6-ost7/ost7:/O/400000400/d11/10617643
         cos6-ost7/ost7:/O/400000400/d21/583029


However, once the resilvering had completed, these permanent errors were 
no longer listed.
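
A minimal sketch, assuming the output format shown above, of how the 
entries under the "errors:" heading could be pulled out of saved zpool 
status -v output, so that the lists from before and after a resilver can 
be compared. The function name and the sample text are illustrative only 
and are not part of any ZFS or Lustre tooling.

```python
def parse_permanent_errors(status_text):
    """Return the entries listed under an 'errors: Permanent errors ...'
    heading in text captured from `zpool status -v`.

    Assumption: the layout matches the excerpt in this post (a heading
    line starting with 'errors:', followed by one indented entry per line).
    """
    entries = []
    in_errors = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("pool:"):
            # A new pool section begins; stop collecting for the previous one.
            in_errors = False
            continue
        if stripped.startswith("errors:"):
            # Only collect when the heading announces permanent errors;
            # 'errors: No known data errors' has nothing to list.
            in_errors = "Permanent errors" in stripped
            continue
        if in_errors and stripped:
            entries.append(stripped)
    return entries


# Sample text copied from the excerpt above (illustrative input only).
SAMPLE = """errors: Permanent errors have been detected in the following files:

         <metadata>:<0x0>
         <metadata>:<0xb01>
         <metadata>:<0x15>
         <metadata>:<0x383>
         cos6-ost7/ost7:/O/400000400/d11/10617643
         cos6-ost7/ost7:/O/400000400/d21/583029
"""

if __name__ == "__main__":
    for entry in parse_permanent_errors(SAMPLE):
        print(entry)
```

Saving the output once before and once after the resilver and comparing 
the two parsed lists (for example as Python sets) would show which 
entries dropped off the list. It would not by itself say whether the 
underlying data was repaired.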

The question, then, is: were these errors really permanent, or was ZFS 
able to correct them?

Lustre has remained fine since (though it obviously froze while the pools 
were suspended).

Should we be worried that there might be some under-the-hood corruption 
that will present itself when we next need to remount the OST (e.g. after 
a reboot)?  In particular, the <metadata>:<0x0> entry worries me a bit!

Thanks,
Alastair.

