[Lustre-discuss] MDT backup procedure

Ramiro Alba Queipo raq at cttc.upc.edu
Thu Jun 18 02:32:36 PDT 2009


Hi all,

To clarify things, let me sum up (please tell me if I am
wrong).

There are 3 ways of doing an MDT backup:

1) Device-level, using the dd command

You can copy the original device to another local device of at least
the same capacity, BUT no clients and no OSTs should be active, so it is
NOT SUITABLE for an automated nightly backup.
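
A minimal sketch of the device-level copy described above, assuming the
MDT is already out of service. The function name and both device
arguments are hypothetical, and the mount check is only a convenience
guard added for this example; it does not prove that clients and OSTs
are inactive.

```shell
#!/bin/sh
# Device-level copy of an MDT that has been taken out of service.
# Both arguments are placeholders for your own block devices.
# Usage: mdt_dd_backup /dev/SOURCE /dev/TARGET
mdt_dd_backup() {
    src=$1
    dst=$2
    # Minimal guard (an assumption of this sketch, not a Lustre rule):
    # refuse to copy if the source device is listed in /proc/mounts.
    if awk -v d="$src" '$1 == d { found = 1 } END { exit !found }' \
           /proc/mounts 2>/dev/null; then
        echo "refusing: $src is still mounted" >&2
        return 1
    fi
    # Plain block copy; bs=1M only sets the transfer size.
    dd if="$src" of="$dst" bs=1M
}
```

The target must be at least as large as the source, and the function
overwrites it without asking, so double-check the argument order.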

2) File-level, using the tar or rsync commands

You can make a copy to another directory (even a remote one), BUT you
MUST STOP Lustre and remount the MDT as an 'ldiskfs' file system. You
also have to save the extended attributes separately
(cd /lustre/mds; getfattr -R -d -m '.*' -P . > /<backup-dir>/ea.bak).
So this is NOT SUITABLE for an automated nightly backup either.
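
The file-level steps above could be scripted roughly as follows. This is
a sketch under assumptions: the function names and the mdt.tgz file name
are made up for illustration, the MDT is assumed to be stopped and
already mounted as ldiskfs, and --sparse assumes GNU tar. The getfattr
invocation is the one quoted above.

```shell
#!/bin/sh
# File-level MDT backup sketch. Assumes the MDT is stopped and mounted
# as ldiskfs first, e.g. (hypothetical device and mount point):
#   mount -t ldiskfs /dev/MDTDEV /lustre/mds

# Dump all extended attributes of the tree under $1 into $2/ea.bak.
save_eas() {
    ( cd "$1" && getfattr -R -d -m '.*' -P . ) > "$2/ea.bak"
}

# Archive the tree under $1 into $2/mdt.tgz.
# $2 must be outside $1, or tar would try to archive its own output.
archive_tree() {
    tar czf "$2/mdt.tgz" --sparse -C "$1" .
}

# Run both steps; stop if saving the attributes fails.
# Usage: mdt_file_backup /lustre/mds /path/to/backup-dir
mdt_file_backup() {
    save_eas "$1" "$2" && archive_tree "$1" "$2"
}
```

The attributes go into a separate file because the tar archive is not
relied on to carry them; a restore would have to apply ea.bak again
after unpacking.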

3) File-level on LVM snapshots

LVM lets you take a snapshot of the MDT while the Lustre file system is
operational, so you can afterwards make a file-level backup of the
LVM snapshot while everything keeps running. So it IS SUITABLE for an
automated backup.
The disadvantages are that you need extra local space for the LVM
snapshots, and that running the MDT on top of LVM has a performance
impact.
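
As a rough illustration of the snapshot approach, the sketch below only
prints the command sequence instead of running it, so it is safe to try
anywhere. The volume group, logical volume, snapshot size and mount
point are all hypothetical placeholders, and the exact mount options for
the snapshot are an assumption that should be checked against the manual
for your Lustre version.

```shell
#!/bin/sh
# Print (do not run) the commands for a snapshot-based MDT backup.
# Usage: snapshot_plan VG LV [SNAPSHOT_SIZE] [MOUNT_POINT]
snapshot_plan() {
    vg=$1
    lv=$2
    size=${3:-1G}            # space reserved for changes during the backup
    mnt=${4:-/mnt/mdt-snap}  # where the snapshot gets mounted
    snap="${lv}-snap"
    echo "lvcreate --snapshot --size $size --name $snap /dev/$vg/$lv"
    echo "mount -t ldiskfs /dev/$vg/$snap $mnt"
    echo "# file-level backup of $mnt goes here (getfattr, then tar)"
    echo "umount $mnt"
    echo "lvremove -f /dev/$vg/$snap"
}
```

Calling snapshot_plan vg0 mdt prints the five steps in order; the
snapshot is removed at the end because it keeps consuming space, and
costing write performance, for as long as it exists.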


By the way, the procedure described in 'How do I replace an OST or MDS?'
in Appendix B of the Lustre Operations Manual differs from the procedure
described in 15.1.3.1 (Backing Up an MDS File):
- getfattr -R -d -m '.*' -P . > ea.bak
- getfattr -R -e base64 -d . > /tmp/mdsea

Which one is the right one?


Cheers 


On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote:
> On Jun 17, 2009  12:35 -0700, Cliff White wrote:
> > Ramiro Alba Queipo wrote:
> > > By reading Chapter 15 of the Lustre Operations Manual, it follows that an
> > > MDT backup is only useful if you are changing hardware or the like.
> > > I am afraid that you cannot expect to replace a failed MDT with a
> > > previous image, as the data on the OSTs and the MDT no longer match, right?
> > 
> > If you do a backup/immediate restore, it should be fine. If you restore 
> > from an old image you will lose the changes made post-backup, but the 
> > rest of the data should be fine.
> > cliffw
> 
> Right - just like any backup, any changes made after the backup will of
> course not be restored.  One additional issue is that some OST objects
> will not be available if they were deleted after the backup, even though
> the restored MDS will still reference them.  Accessing these files will
> return -ENOENT.
> 
> At that point it would be possible (though not necessary) to run "lfsck"
> to clean up the inconsistencies between the MDT and OST filesystems.
> It is also possible to just re-delete the files that have "-ENOENT" and
> restore (from some other filesystem-level backup) the rest of the files.
> 
> An MDS backup is a good idea, because it avoids having to restore 100TB+
> (or whatever) of data from backup, leaving only a smaller number of changed
> files that might need to be restored.  It should NOT be the only form of
> backup for the filesystem, since it does not contain any of the FILE data.
> You, or your users, should do backups of their critical files separately.
> 
> > > On Wed, 2009-06-17 at 09:41 -0600, Daniel Kulinski wrote:
> > >> As we move forward with our lustre testing I am wondering about MDT
> > >> backup.  
> > >>
> > >>  
> > >>
> > >> Is it feasible to unmount the MDT, create an image of it and remount
> > >> it after the backup?  Of course this would only happen nightly.
> > >>
> > >>  
> > >>
> > >> From what I can identify, in the case of an MDT failure we would have
> > >> to do the following:
> > >>
> > >>  
> > >>
> > >> Restore from the last backup.
> > >>
> > >> Run an lfsck across the filesystem.
> > >>
> > >>  
> > >>
> > >> Am I missing anything else at this point?  We will also be doing file
> > >> level backups of the filesystem as a whole but we are looking for
> > >> quick ways to recover from an MDT failure.
> > >>
> > >>  
> > >>
> > >> Thanks,
> > >>
> > >>   Dan Kulinski
> > >>
> > >>
> > >>
> > >> -- 
> > >> This message has been scanned by MailScanner
> > >> for viruses and other dangerous content,
> > >> and is believed to be clean.
> > >> MailScanner thanks transtec Computers for their support.
> > >> _______________________________________________
> > >> Lustre-discuss mailing list
> > >> Lustre-discuss at lists.lustre.org
> > >> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> > >>
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 
> 
-- 
Ramiro Alba

Centre Tecnològic de Transferència de Calor
http://www.cttc.upc.edu


Escola Tècnica Superior d'Enginyeries
Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46

