[Lustre-discuss] best practice for lustre cluster startup

Sat Jul 3 22:00:34 PDT 2010

On 2010-07-03, at 15:02, pg_lus at lus.for.sabi.co.UK wrote:
>> Note that if you are not running with writeback cache enabled
>> on the disks, then you shouldn't have to run an fsck on the
>> filesystems after a crash.
> 
> This seems to me extremely bad advice, based on these rather
> extraordinarily optimistic assumptions:
> 
>> That should only be needed if the storage is faulty, or if it
>> is using writeback cache without mirroring and battery backup.
> 
> This reminds me of the immortal statement "as far as we know in
> our datacenter we never had an undetected error".

I think my record speaks for itself in terms of advocating running fsck on filesystems on a regular basis. I think you are making assumptions about what my statement says or does not say. What it says is that you shouldn't need to run fsck after a crash, provided the crash did not involve, e.g., a RAID controller failure or the loss of writeback cache.

It doesn't say that you should never run fsck, and in fact I always recommend a full fsck in case of a RAID failure or if the filesystem has detected inconsistencies.

My point was that if there are uptime requirements, then running a full fsck after an unplanned outage of one node is probably a bad use of time. It would be better to run a full fsck on ALL of the filesystems during scheduled maintenance windows, since the checks can run in parallel and so would take no longer than checking a single node.
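
To make the timing argument concrete, here is a rough sketch of the arithmetic. It assumes each target sits on its own server with independent storage, so the checks do not compete for I/O; the function name and the hour figures are made up for illustration.

```python
def fsck_wall_clock_hours(per_target_hours, parallel):
    """Estimate the wall-clock time needed to check every target.

    per_target_hours: hypothetical fsck duration of each target, in hours.
    parallel: True if every server checks its own target at the same time.
    """
    if not per_target_hours:
        return 0.0
    # In parallel the window is bounded by the slowest target;
    # one after another the durations simply add up.
    return max(per_target_hours) if parallel else sum(per_target_hours)


# Made-up durations for three targets.
hours = [2.0, 3.0, 2.5]
print("parallel:  ", fsck_wall_clock_hours(hours, parallel=True))
print("sequential:", fsck_wall_clock_hours(hours, parallel=False))
```

Under that assumption the maintenance window only needs to cover the slowest single target.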

I have also written the lvcheck tool to run fsck on LVM snapshots via cron on a regular basis so that you don't need to wait for a crash before validating whether your hardware is faulty.
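
As an illustration of the general approach (this is not the lvcheck code itself), the sketch below only builds the command lines one might run from cron for a single logical volume. The function name, the snapshot suffix, the default snapshot size and the choice of `e2fsck -f -p` are assumptions made for the example. LVM2 snapshots are writable by default, so any journal replay or repair done by e2fsck lands on the snapshot, not on the live filesystem.

```python
import shlex


def snapshot_fsck_plan(vg, lv, snap_size="1G"):
    """Return (label, argv) steps for checking an LVM snapshot of vg/lv.

    Nothing is executed here. The snapshot name and size are
    hypothetical defaults chosen for this sketch.
    """
    snap = f"{lv}_fsck_snap"
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Take a point-in-time snapshot of the (possibly mounted) origin.
        ("create", ["lvcreate", "--snapshot", "--size", snap_size,
                    "--name", snap, origin]),
        # Force a full check of the snapshot.
        ("check", ["e2fsck", "-f", "-p", snap_dev]),
        # Only if the check exited 0: reset the mount count and the
        # last-checked time on the origin.
        ("reset_if_clean", ["tune2fs", "-C", "0", "-T", "now", origin]),
        # Always drop the snapshot afterwards.
        ("cleanup", ["lvremove", "-f", snap_dev]),
    ]


# Dry run with hypothetical volume group and volume names.
for label, argv in snapshot_fsck_plan("vg0", "ost0"):
    print(f"{label}: {shlex.join(argv)}")
```

A wrapper that actually runs these steps should skip the `reset_if_clean` step whenever the check exits non-zero, and report that to the administrator.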

> a full scan, at least every now and then, is essential to give some
> confidence that no hidden problem has been eating the metadata.

I've been a staunch advocate among the ext4 developers for keeping the periodic fsck at mount time to catch those sites that never run fsck on their own. If that bothers people because of the unexpected delay in startup, I point them at the lvcheck script so they can check the snapshot and reset the fsck counters before they expire.
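
To see how close a filesystem is to the mount-count trigger, one can read the "Mount count" and "Maximum mount count" fields printed by `tune2fs -l`. The sketch below parses such a listing; the function name and the sample values are made up, the result is only approximate, and it ignores the separate time-based check interval.

```python
def mounts_until_periodic_fsck(tune2fs_listing):
    """Roughly how many more mounts before the mount-count check triggers.

    tune2fs_listing: text as printed by `tune2fs -l <device>`.
    Returns None if the mount-count check is disabled (maximum of 0 or -1).
    """
    fields = {}
    for line in tune2fs_listing.splitlines():
        # Split on the first colon only, since values such as dates
        # contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    count = int(fields["Mount count"])
    maximum = int(fields["Maximum mount count"])
    if maximum <= 0:
        return None
    return max(maximum - count, 0)


# Made-up excerpt in the layout that `tune2fs -l` prints.
sample = """\
Mount count:              28
Maximum mount count:      30
Last checked:             Sat Jul  3 22:00:34 2010
"""
print(mounts_until_periodic_fsck(sample))
```

Checking a snapshot and resetting the counters while this number is still above zero avoids the surprise delay at the next mount.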