[Lustre-discuss] Backing up the OSSs and MDSs

Andreas Dilger adilger at sun.com
Tue Nov 4 14:43:45 PST 2008


On Oct 31, 2008  19:38 +0100, Tim Bell wrote:
> For the OSSs, from what I can see, this can be achieved on a per-volume 
> basis using LVM and snapshotting.  However, I feel uncomfortable with 
> this approach and would prefer something where I can restore an 
> individual file from backup if lost.

I would strongly recommend filesystem-level backups instead of doing
device-level backups.  This gives you full control over which files to
back up, and allows restoring individual files as needed.  It is also
easiest to integrate into existing backup solutions.

It is probably also desirable to do a device-level backup of the MDS,
because it is the central part of the Lustre filesystem, and if this
device fails permanently then the entire filesystem would need to be
restored.  Lustre can generally cope with an MDS backup that is slightly
out of date w.r.t. the OSTs, though care must be taken when restoring
in such a scenario.

> Is there another way other than 
> find with mtime to find the list of modified files in the file system ?

There is the "e2scan" program that is included with recent versions of
the lustre e2fsprogs RPM.  It is a fast scanner of the MDS filesystem
that returns a list of files that have changed, like "find".  You can
also use "find" or "lfs find" to traverse the filesystem, and this
can be very fast if run on a client mounted on the MDS node.
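
As a rough illustration of the "find with mtime" approach, here is a
minimal Python sketch that walks a mounted filesystem and lists regular
entries modified after a given time.  The function name and the
/mnt/lustre mount point are only assumptions for the example; it is not
a substitute for e2scan, and it will be slow on a large tree.

```python
import os
import time


def files_modified_since(root, cutoff_epoch):
    """Yield paths under root whose mtime is newer than cutoff_epoch.

    Roughly comparable to "find root -newer <reference>"; this walks the
    whole tree through the client mount, so it scales with file count.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file disappeared while we were scanning
            if st.st_mtime > cutoff_epoch:
                yield path


# Example use (hypothetical mount point): files changed in the last day.
# for path in files_modified_since("/mnt/lustre", time.time() - 24 * 3600):
#     print(path)
```

The resulting list can then be handed to whatever file-level backup
tool is already in use.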

In the 2.0 release there will be a changelog exported from Lustre which
will catalog files that have changed in the filesystem since the last
time an application processed the log.

> For the MDSs, I cannot see how to back up an online MDS.  If I do a 
> split mirror without quiesce and backup the ext3 file system of the 
> split-off mirror, I get an inconsistent state of database.

You should use an LVM snapshot, which ensures the filesystem is coherent
when the snapshot is made.  There are patches going into the upstream
kernel by Takashi Sato of NEC to allow a userspace process to freeze
the filesystem while doing this sort of hardware mirror split
(http://marc.info/?l=linux-fsdevel&m=122511248322724&w=4), but as
yet this is not in the upstream kernel or any Lustre kernel.
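
To make the snapshot route concrete, here is a hedged sketch of the
command sequence for a device-level copy of an MDS volume taken from an
LVM snapshot.  It only builds and prints the commands and does not run
anything.  The volume group, volume names, snapshot size and image path
are all made up for the example, and the snapshot must be sized to hold
the writes that happen while the copy runs.

```python
import shlex


def snapshot_backup_commands(vg, lv, snap_name, snap_size, image_path):
    """Return the commands for a snapshot-based device-level copy.

    All arguments are site-specific; the values used below are
    hypothetical.  Nothing is executed here.
    """
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snap_name}"
    return [
        # 1. Create a snapshot of the live MDS volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # 2. Copy the frozen snapshot device to an image file.
        ["dd", f"if={snap}", f"of={image_path}", "bs=1M"],
        # 3. Drop the snapshot so it stops consuming space.
        ["lvremove", "-f", snap],
    ]


def format_commands(commands):
    """Render each command as a shell-quoted string for review."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


for line in format_commands(
    snapshot_backup_commands("vgmds", "mdt0", "mdt0snap", "1G",
                             "/backup/mdt0.img")
):
    print(line)
```

Printing the plan first makes it easy to review the device names before
running anything against a production MDS.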

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



