[Lustre-discuss] MDT backup
aik at fnal.gov
Mon Jan 19 20:46:23 PST 2009
What is the right way to back up an MDT? People are worried about what
will happen on "The Day After" a catastrophic disk failure.
We are about to release our Lustre system into production; preproduction
testing has gone well so far (thanks!).
- At present we take LVM snapshots to back up the MDT, and we were able to
restore a snapshot to another node.
Is there any way to capture changes made on the MDT after the snapshot is
taken, and how close can we get to the point of the crash? Is there some
kind of MDT journal synchronized with the LVM snapshot? Is there a way to
do incremental backups so that we can back up more often?
- What is the experience with DRBD replication? There are multiple
reports from sites using it, and there are also reports indicating that
the replicated file system is not clean when the master MDT crashes,
as DRBD knows nothing about the file system on top of it and does not
synchronize with it.
Is there a way to avoid corruption, or is it just fixed by fsck? Can DRBD
fail over "cleanly" if we do it manually, e.g. to upgrade the master MDS?
Can I verify that the slave disk is consistent with the master and is not
corrupt after a year of running?
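
The snapshot procedure mentioned in the first item can be sketched roughly
as follows. This is only an illustration under assumptions, not a tested
Lustre procedure: the volume group "vgmdt", logical volume "mdt0", snapshot
name, snapshot size and image path are all hypothetical, and the script
prints the commands instead of running them unless dry_run is turned off.

```python
import subprocess


def snapshot_backup_commands(vg, lv, snap_name, snap_size, image_path):
    """Build the commands for one device-level backup via an LVM snapshot.

    All names passed in are placeholders chosen by the caller.
    """
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    return [
        # Create a copy-on-write snapshot of the MDT volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # Copy the frozen snapshot to an image file.
        ["dd", f"if={snapshot}", f"of={image_path}", "bs=4M"],
        # Remove the snapshot so it stops consuming copy-on-write space.
        ["lvremove", "-f", snapshot],
    ]


def run_backup(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; adjust to the local volume layout.
    run_backup(snapshot_backup_commands(
        "vgmdt", "mdt0", "mdt0_snap", "2G", "/backup/mdt0.img"))
```

A backup of this kind only captures the MDT as of the moment the snapshot
is created, which is why the questions above about later changes matter.
Note also that if the copy step fails in a real run, the snapshot is left
in place and has to be removed by hand.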
It seems that neither the LVM nor the DRBD approach is perfect. Are there
plans to implement native replication of the MDT in Lustre?
SNS is for OSTs only, right?
I would appreciate hearing about your experience with MDT backup.