[Lustre-discuss] MDT backup procedure

Jim Garlick garlick at llnl.gov
Fri Jun 19 08:59:29 PDT 2009


On Fri, Jun 19, 2009 at 10:39:06AM +0200, Ramiro Alba Queipo wrote:
> Andreas,
> 
> This is a very interesting discussion, and it has raised some doubts on
> the matter.
> 
> On Thu, 2009-06-18 at 10:24 -0600, Andreas Dilger wrote:
> > On Jun 18, 2009  11:32 +0200, Ramiro Alba Queipo wrote:
> > > There are 3 ways of doing an MDT backup:
> > > 
> > > 1) Device-level using dd command
> > > 
> > > You can do it from the original device to another local device with at
> > > least the same capacity, BUT no clients and no OSTs should be active, so
> > > NOT SUITABLE for an automated nightly backup
> > 
> > Well, "no clients/OSTs should be active" is a relative term.  You will
> > almost certainly have a usable backup even if the filesystem was active,
> > because ext3 has a robust on-disk layout, but you would need to run an
> > e2fsck afterward.
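
As a rough illustration of the device-level approach discussed above, the
copy-then-check sequence might be scripted as below. This is only a sketch:
the device names and the helper names (run, mdt_dd_backup) are hypothetical,
and DRY_RUN defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: device names are placeholders, not taken from the thread.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mdt_dd_backup SRC DST
# Copy the MDT block device to a spare device, then check the copy.
mdt_dd_backup() {
    src=$1   # MDT device (assumed), e.g. /dev/sda
    dst=$2   # spare device of at least the same capacity (assumed)
    run dd if="$src" of="$dst" bs=1M || return 1
    # Forced check of the copy; -y answers yes to every repair prompt.
    run e2fsck -f -y "$dst"
    rc=$?
    # e2fsck exits 0 when clean and 1 when errors were corrected.
    [ "$rc" -le 1 ]
}
```

Calling "mdt_dd_backup /dev/sda /dev/sdb" with DRY_RUN=1 only prints the two
commands; set DRY_RUN=0 as root to actually run them.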
> > 
> > > 2) File-level using tar or rsync commands
> > > 
> > > You can make a copy to another directory (even remotely) BUT you MUST STOP
> > > lustre and remount it as an 'ldiskfs' file system type. You also have to
> > > save additional information (cd /lustre/mds; getfattr -R -d -m '.*' -P .
> > > > /<backup-dir>/ea.bak). So NOT SUITABLE for an automated nightly backup
> > > either
> > 
> > Right.  Note that when using "tar" or "rsync" you should use the "--sparse"
> > option so that it doesn't write out the unallocated (sparse) regions of
> > files.  Also, with newer versions
> 
> Can you tell me which versions? (I am using Ubuntu 8.04 with tar-1.19
> and rsync-2.6.9).

The versions we tested:
- tar-1.20-5 from Fedora 10 works.
- tar-1.15.1-23.0.1 from RHEL 5 does NOT work

Also, for file level backup, exclude /OBJECTS/* and /CATALOGS from the
backup, and make sure clients are unmounted during the restore or their
caches will become corrupt when the restored MDS comes back online
(due to changing inode numbers on the backing fs I believe).

The procedure that we tested a while back is as follows (to which I would
add Andreas's suggestion of --sparse):

   # Backup
   mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt
   tar --xattrs --no-selinux --exclude './OBJECTS/*' \
       --exclude './CATALOGS' -C/mnt/mdt -cf backup.tar .

   # Restore
   mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt
   tar -C/mnt/mdt -xf backup.tar
   # (be afraid if this command produces no output)
   getfattr -d -m ".*" -R /mnt/mdt | grep trusted.lov | more
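
The final check in the procedure above could be automated instead of eyeballed.
The following is a minimal sketch, assuming getfattr prints one
"trusted.lov=..." line per file that has the attribute; the function names
count_lov and verify_restore are made up for illustration.

```shell
#!/bin/sh
# count_lov: read "getfattr -d" output on stdin and print how many
# trusted.lov attributes it contains.
count_lov() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit status.
    grep -c '^trusted\.lov=' || true
}

# verify_restore MOUNTPOINT
# Fail if no trusted.lov attribute is found under the restored MDT.
# Reading trusted.* attributes requires root.
verify_restore() {
    n=$(getfattr -d -m '.*' -R -P "$1" 2>/dev/null | count_lov)
    if [ "${n:-0}" -eq 0 ]; then
        echo "no trusted.lov attributes under $1; restore is suspect" >&2
        return 1
    fi
    echo "$n trusted.lov attributes found under $1"
}
```

Running "verify_restore /mnt/mdt" after the untar returns non-zero in the
"be afraid" case, which makes it usable from a restore script.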

> > of tar (on RHEL/FC) and rsync it is possible to have it do the backup/restore
> > of the extended attributes directly.
> 
> You mean there is no need to use the getfattr/setfattr commands?
> 
> > 
> > You could also use "dump-0.4b40" (or later) to do a hybrid device/file
> > level backup.  It will back up the filesystem directly from the block device,
> > but only the files that are in use.  Versions 0.4b40+ can also do the
> > backup/restore of extended attributes, which is critical.
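
For illustration, a dump-based backup along these lines might be wrapped as
below. This is a sketch under assumptions: it presumes dump 0.4b40 or later is
installed, the device and output paths are placeholders, and the names run and
mdt_dump are hypothetical. DRY_RUN defaults to printing the command only.

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mdt_dump LEVEL DEVICE OUTFILE
# LEVEL 0 is a full dump; higher levels are incremental.
mdt_dump() {
    level=$1; dev=$2; out=$3
    case $level in
        [0-9]) ;;
        *) echo "dump level must be a single digit 0-9" >&2; return 2 ;;
    esac
    # -u records this dump in the dumpdates file so that later,
    # higher-level dumps of the same device are incremental.
    run dump "-$level" -u -f "$out" "$dev"
}
```

For example, "mdt_dump 0 /dev/vg0/mdtsnap /backup/mdt.0.dump" prints the full
dump command; a matching restore would use restore(8) from the same package.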
> > 
> > > 3) File-level on LVM snapshots
> > > 
> > > LVM allows you to take a snapshot of the MDT while the lustre file system
> > > is operational, so you can afterwards make a File-level backup of the
> > > LVM snapshot while everything is running. So it IS SUITABLE for an
> > > automated backup.
> > > Disadvantages are that you need extra local space for LVM snapshots and
> > > the performance impact of running the MDT on top of LVM.
> > 
> > This is probably the best option.  It allows consistent backups to be
> > done, and if you only keep a single snapshot the performance hit isn't
> > too big.
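
A snapshot-based nightly backup could look roughly like the sketch below. It
assumes the MDT lives on an LVM logical volume; the volume names, snapshot
size, mount point and the function names (run, mdt_snapshot_backup) are all
placeholders chosen for illustration. DRY_RUN defaults to printing commands.

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mdt_snapshot_backup ORIGIN_LV OUTFILE [SNAP_NAME] [SNAP_SIZE] [MOUNTPOINT]
mdt_snapshot_backup() {
    origin=$1; out=$2
    snap_name=${3:-mdtsnap}
    snap_size=${4:-2G}
    mnt=${5:-/mnt/mdtsnap}
    snap_dev=$(dirname "$origin")/$snap_name

    # Take the snapshot while the MDT stays in service.
    run lvcreate -s -L "$snap_size" -n "$snap_name" "$origin" || return 1

    status=0
    if run mount -t ldiskfs -o ro,user_xattr "$snap_dev" "$mnt"; then
        # File-level backup of the snapshot, xattrs included.
        run tar --xattrs --sparse --exclude './OBJECTS/*' \
            --exclude './CATALOGS' -C "$mnt" -cf "$out" . || status=1
        run umount "$mnt" || status=1
    else
        status=1
    fi

    # Keep only one snapshot around: always remove it afterwards.
    run lvremove -f "$snap_dev" || status=1
    return $status
}
```

Removing the snapshot at the end, even when the tar step fails, follows the
advice above to keep at most a single snapshot alive.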
> 
> So, the best option for automated backups could be to use LVM
> snapshots and then run 'dump' with dump levels over the mounted
> snapshot. There is no need to use the getfattr/setfattr commands, right?
> 
> What about the impact of using LVM for the MDT on overall Lustre
> performance? 
> 
> > 
> > > By the way, the procedure described at 'How do I replace an OST or MDS?'
> > > in Appendix B of the Lustre Operations Manual differs from the procedure
> > > described at 15.1.3.1 (Backing Up an MDS File):
> > > - getfattr -R -d -m '.*' -P . > ea.bak
> > > - getfattr -R -e base64 -d . > /tmp/mdsea
> > 
> > I would say the first one is better, though I like to use "-e hex"
> > instead of "-e base64" because the hex output is easier for me to
> > decode if I need to for some reason.  Probably the "replace an OST/MDT"
> > chapter should just reference the backup/restore section instead of
> > duplicating the content.
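
For tar or rsync versions that cannot carry the extended attributes themselves,
the getfattr/setfattr pair might be wrapped as in the sketch below, using the
"-e hex" encoding mentioned above. The function names save_eas and restore_eas
are hypothetical, and the MDT is assumed to be mounted as ldiskfs at the given
path.

```shell
#!/bin/sh
# save_eas MDT_MOUNT EA_FILE
# Dump all extended attributes under MDT_MOUNT, hex-encoded, with paths
# relative to the mount point.
save_eas() {
    ( cd "$1" && getfattr -R -d -m '.*' -e hex -P . ) > "$2"
}

# restore_eas MDT_MOUNT EA_FILE
# Re-apply the attributes after the files have been restored.
# EA_FILE must be an absolute path because of the cd.
restore_eas() {
    ( cd "$1" && setfattr --restore="$2" )
}
```

A typical pairing would be "save_eas /mnt/mdt /backup/ea.bak" before the
file-level copy and "restore_eas /mnt/mdt /backup/ea.bak" after the restore,
both run as root so that trusted.* attributes are visible.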
> > 
> > > On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote:
> > > > On Jun 17, 2009  12:35 -0700, Cliff White wrote:
> > > > > Ramiro Alba Queipo wrote:
> > > > > > By reading Chapter 15 of Lustre Operations Manual, it follows that an
> > > > > > MDT backup is only useful if you are changing hardware or the like.
> > > > > > I am afraid that you cannot expect to replace a failed MDT with a
> > > > > > previous image, as the data on the OSTs and the MDT no longer match, right?
> > > > > 
> > > > > If you do a backup/immediate restore, it should be fine. If you restore 
> > > > > from an old image you will lose the changes made post-backup, but the 
> > > > > rest of the data should be fine.
> > > > > cliffw
> > > > 
> > > > Right - just like any backup, any changes made after the backup will of
> > > > course not be restored.  One additional issue is that some OST objects
> > > > will not be available if they were deleted after the backup, even though
> > > > the restored MDS will still reference them.  Accessing these files will
> > > > return -ENOENT.
> > > > 
> > > > At that point it would be possible (though not necessary) to run "lfsck"
> > > > to clean up the inconsistencies between the MDT and OST filesystems.
> > > > It is also possible to just re-delete the files that have "-ENOENT" and
> > > > restore (from some other filesystem-level backup) the rest of the files.
> > > > 
> > > > An MDS backup is a good idea, because it avoids having to restore 100TB+
> > > > (or whatever) of data from backup, leaving only a smaller number of changed
> > > > files that might need to be restored.  It should NOT be the only form of
> > > > backup for the filesystem, since it does not contain any of the FILE data.
> > > > You, or your users, should do backups of their critical files separately.
> > > > 
> > > > > > On Wed, 2009-06-17 at 09:41 -0600, Daniel Kulinski wrote:
> > > > > >> As we move forward with our lustre testing I am wondering about MDT
> > > > > >> backup.  
> > > > > >>
> > > > > >>  
> > > > > >>
> > > > > > >> Is it feasible to unmount the MDT, create an image of it and remount
> > > > > > >> it after the backup?  Of course this would only happen nightly.
> > > > > >>
> > > > > >>  
> > > > > >>
> > > > > >> From what I can identify, in the case of an MDT failure we would have
> > > > > >> to do the following:
> > > > > >>
> > > > > >>  
> > > > > >>
> > > > > >> Restore from the last backup.
> > > > > >>
> > > > > >> Run an lfsck across the filesystem.
> > > > > >>
> > > > > >>  
> > > > > >>
> > > > > >> Am I missing anything else at this point?  We will also be doing file
> > > > > >> level backups of the filesystem as a whole but we are looking for
> > > > > >> quick ways to recover from an MDT failure.
> > > > > >>
> > > > > >>  
> > > > > >>
> > > > > >> Thanks,
> > > > > >>
> > > > > >>   Dan Kulinski
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> -- 
> > > > > > >> This message has been scanned by MailScanner
> > > > > > >> for viruses and other dangerous content,
> > > > > > >> and is believed to be clean.
> > > > > > >> MailScanner thanks transtec Computers for their support.
> > > > > >> _______________________________________________
> > > > > >> Lustre-discuss mailing list
> > > > > >> Lustre-discuss at lists.lustre.org
> > > > > > >> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> > > > > >>
> > > > > >> ------------------------------------------------------------------------
> > > > > >>
> > > > > >> _______________________________________________
> > > > > >> Lustre-discuss mailing list
> > > > > >> Lustre-discuss at lists.lustre.org
> > > > > >> http:// lists.lustre.org/mailman/listinfo/lustre-discuss
> > > > > 
> > > > > _______________________________________________
> > > > > Lustre-discuss mailing list
> > > > > Lustre-discuss at lists.lustre.org
> > > > > http:// lists.lustre.org/mailman/listinfo/lustre-discuss
> > > > 
> > > > Cheers, Andreas
> > > > --
> > > > Andreas Dilger
> > > > Sr. Staff Engineer, Lustre Group
> > > > Sun Microsystems of Canada, Inc.
> > > > 
> > > > 
> > > -- 
> > > Ramiro Alba
> > > 
> > > Centre Tecnològic de Tranferència de Calor
> > > http://www.cttc.upc.edu
> > > 
> > > 
> > > Escola Tècnica Superior d'Enginyeries
> > > Industrial i Aeronàutica de Terrassa
> > > Colom 11, E-08222, Terrassa, Barcelona, Spain
> > > Tel: (+34) 93 739 86 46
> > > 
> > > 
> > 
> > 
> > Cheers, Andreas
> > --
> > Andreas Dilger
> > Sr. Staff Engineer, Lustre Group
> > Sun Microsystems of Canada, Inc.
> > 
> > 
