[Lustre-discuss] MDT move aka backup w rsync

Thomas Roth t.roth at gsi.de
Thu Jul 16 06:51:04 PDT 2009


For the record, I should add that I had forgotten one step that proves
to be important and has been mentioned before on this list:

After copying with rsync, I had to run
    cd /srv/mdt
    rm CATALOGS OBJECTS/*
on the new MDT partition.

Otherwise the OSTs are kicked out on remount with "error looking up
logfile ...: rc -2" and "OST0000_UUID sync failed -2, deactivating".
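
In script form, the cleanup step could look roughly like this. It is only
a sketch: mdt_cleanup is a made-up helper name, and it assumes the new MDT
is mounted as ldiskfs at the path passed in (/srv/mdt above). Unlike the
plain rm above, it uses rm -f so that running it on an already cleaned
copy is not an error.

```shell
#!/bin/sh
# Sketch only: remove the old CATALOGS file and the contents of OBJECTS/
# from a freshly rsync'ed MDT copy. mdt_cleanup is a hypothetical helper.
mdt_cleanup() {
    mdt_root="$1"

    # Refuse to touch a directory that does not look like an MDT copy.
    if [ ! -d "$mdt_root/OBJECTS" ]; then
        echo "no OBJECTS directory under $mdt_root" >&2
        return 1
    fi

    # Delete the CATALOGS file and everything inside OBJECTS/,
    # but keep the OBJECTS directory itself.
    rm -f "$mdt_root/CATALOGS"
    rm -f "$mdt_root"/OBJECTS/*
}
```

It would be called as "mdt_cleanup /srv/mdt" while the new partition is
still mounted as ldiskfs, before unmounting it and starting Lustre.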

Regards,
Thomas

Thomas Roth wrote:
> Hi all,
> 
> I want to move an MDT from one server to another. After studying some
> mails concerning MDT backup, I've just tried (successfully, it seems) to
> do that on a small test system  with rsync:
> 
> - Stop Lustre, umount all servers.
> - Format a suitable disk partition on the new hardware, using the same
> mkfs-options as for the original MDT.
> - Mount the original MDT: mount -t ldiskfs /dev/sdb1 /mnt
> - Mount the target partition: mount -t ldiskfs -O ext_attr /dev/sdb1 /mnt
> - Copy the data: rsync -Xav oldserver:/mnt/ newserver:/mnt
> - Umount partitions, restart MGS
> - Mount new MDT
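
The copy steps above can be sketched as a small script, under a few
assumptions. rsync does not accept a remote source and a remote
destination in one call, so the sketch is meant to run as root on the new
server and pull from the old one. The -O ext_attr option is left out:
according to mount(8), -O only restricts which filesystems "mount -a"
applies to, which would explain why mount did not complain about it. The
host name, device and mount point are the placeholders from the list, and
run and copy_mdt are made-up helper names. By default the script only
prints the commands.

```shell
#!/bin/sh
# Sketch of the copy steps, to be run as root on the new server.
# All names below are placeholders and can be overridden via environment.
OLD_HOST="${OLD_HOST:-oldserver}"   # host with the original MDT mounted at $MNT
NEW_DEV="${NEW_DEV:-/dev/sdb1}"     # freshly formatted target partition
MNT="${MNT:-/mnt}"                  # mount point used on both servers
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_mdt() {
    run mount -t ldiskfs "$NEW_DEV" "$MNT"
    # -X copies extended attributes, -a preserves ownership, modes and times.
    run rsync -Xav "$OLD_HOST:$MNT/" "$MNT"
    # CATALOGS and OBJECTS/* have to be removed at this point, before
    # unmounting (see the correction at the top of this mail).
    run umount "$MNT"
}
```

Calling copy_mdt with DRY_RUN=1 lists the three commands; setting
DRY_RUN=0 would actually execute them, so the placeholders should be
checked first.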
> 
> This procedure was described by Jim Garlick on this list. You might note
> that I used the mount option "-O ext_attr" only on the target machine:
> my mistake perhaps, but no visible problems. In fact, I haven't found
> this option mentioned in any man page or on the net. Nevertheless, my
> mount command did not complain about it. So I wonder whether it is
> necessary at all: I seem to have extracted the attributes from the old
> MDT all right without this mount option.
> 
> My main question is whether this is a correct procedure for MDT backups,
> or rather copies.
> 
> I'm investigating this because our production MDT seems to have a number
> of problems. In particular the underlying file system is in bad shape,
> fsck correcting a large number of ext3-errors, incorrect inodes and so
> forth. We want to verify that it is not a hardware issue - bit-flipping
> RAID controller, silent "memory corruption", whatever. We have a
> DRBD-mirror of this MDT running, but of course DRBD just reproduces all
> errors on the mirror.  Copying from one ldiskfs to another should avoid
> that?
> 
> The traditional backup method of getting the EAs and tar-ing the MDT
> doesn't finish in finite time. It did before, and the filesystem has
> since grown by a mere 40GB of data, so it shouldn't take that much
> longer - certainly another indication that there is something wrong.
> Of course I have yet to see whether "rsync -Xav" does much better on the
> full system ;-)
> 
> Hm, not sure whether this all makes sense.
> 
> The system runs Debian Etch, kernel 2.6.22, Lustre 1.6.7.1
> 
> Regards,
> Thomas
> 

-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
D-64291 Darmstadt
www.gsi.de



