[Lustre-discuss] MDT move aka backup w rsync

Thomas Roth t.roth at gsi.de
Wed Jul 15 09:35:45 PDT 2009


Hi all,

I want to move an MDT from one server to another. After studying some
mails on this list concerning MDT backup, I've just tried (successfully,
it seems) to do that on a small test system with rsync:

- Stop Lustre, unmount all servers.
- Format a suitable disk partition on the new hardware, using the same
  mkfs options as for the original MDT.
- Mount the original MDT (on the old server):
    mount -t ldiskfs /dev/sdb1 /mnt
- Mount the target partition (on the new server):
    mount -t ldiskfs -O ext_attr /dev/sdb1 /mnt
- Copy the data:
    rsync -Xav oldserver:/mnt/ newserver:/mnt
- Unmount both partitions, restart the MGS.
- Mount the new MDT.
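
The copy steps above can be sketched as a small dry-run script. This is only
an illustration under stated assumptions, not a tested migration tool: it is
meant to run on the new server and pull from the old one (rsync refuses a
copy where both source and destination are remote), it assumes the original
MDT is already mounted as ldiskfs on the old server, and OLD_HOST, MDT_DEV,
MNT and MDT_MOVE_EXECUTE are placeholder names made up for this sketch. The
mount options are simply the ones quoted in the list.

```shell
#!/bin/sh
# Dry-run sketch of the copy steps, to be run on the NEW server.
# OLD_HOST, MDT_DEV and MNT are placeholders based on the example values above.
# Commands are only printed unless MDT_MOVE_EXECUTE=1 is set in the environment.
OLD_HOST=${OLD_HOST:-oldserver}
MDT_DEV=${MDT_DEV:-/dev/sdb1}
MNT=${MNT:-/mnt}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${MDT_MOVE_EXECUTE:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

plan_mdt_move() {
    # Mount the freshly formatted target as ldiskfs (options as quoted above).
    run mount -t ldiskfs -O ext_attr "$MDT_DEV" "$MNT" || return 1
    # Pull everything from the old server; -X asks rsync to copy extended
    # attributes, -a preserves ownership, permissions, times and links.
    run rsync -Xav "$OLD_HOST:$MNT/" "$MNT" || return 1
    # Unmount again so the partition can be mounted as the new MDT.
    run umount "$MNT" || return 1
}

plan_mdt_move
```

Run without MDT_MOVE_EXECUTE set, the script only prints the three commands,
which makes it easy to check device and mount point before touching anything.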

This procedure was described by Jim Garlick on this list. You might note
that I used the mount option "-O ext_attr" only on the target machine:
my mistake perhaps, but there were no visible problems. In fact, I haven't
found this option mentioned in any man page or on the net. Nevertheless,
my mount command did not complain about it. So I wonder whether it is
necessary at all: I seem to have extracted the attributes from the old
MDT all right without this mount option.
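
One way to turn "seem to have extracted the attributes all right" into a
check is to dump the extended attributes on both sides and compare them. The
following is a rough sketch, assuming getfattr from the attr package is
installed and the dump is taken as root (the trusted.* attributes are not
readable otherwise). The function names and the /tmp file names are made up
for this illustration.

```shell
#!/bin/sh
# Turn getfattr's dump format into one "path<TAB>attr=value" line per
# attribute and sort it, so that two dumps can be compared with diff even
# if the directories were traversed in a different order.
flatten_xattr_dump() {
    awk '/^# file: /{ f = substr($0, 9); next } NF { print f "\t" $0 }' |
        LC_ALL=C sort
}

# Dump all extended attributes below directory $1.
# -R recurse, -d dump values, -m '.*' match every attribute name (the
# default pattern only covers user.*), -e hex print values as hex,
# -P do not follow symlinks.
dump_xattrs() {
    (cd "$1" && getfattr -R -d -m '.*' -e hex -P .) | flatten_xattr_dump
}

# Intended use, with the MDT mounted as ldiskfs on /mnt on each server:
#   dump_xattrs /mnt > /tmp/xattrs.old     # on the old server
#   dump_xattrs /mnt > /tmp/xattrs.new     # on the new server
#   diff /tmp/xattrs.old /tmp/xattrs.new   # after copying one file over
```

An empty diff would suggest that the attributes arrived intact; it says
nothing about whether the values on the old MDT were correct to begin with.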

My main question is whether this is a correct procedure for MDT backups,
or rather copies.

I'm investigating this because our production MDT seems to have a number
of problems. In particular, the underlying file system is in bad shape,
with fsck correcting a large number of ext3 errors, incorrect inodes and
so forth. We want to verify that it is not a hardware issue: a
bit-flipping RAID controller, silent "memory corruption", whatever. We
have a DRBD mirror of this MDT running, but of course DRBD just
reproduces all errors on the mirror. Copying file by file from one
ldiskfs to another should avoid that, shouldn't it?

The traditional backup method of saving the EAs and tar-ing the MDT no
longer finishes in any reasonable time. It did before, and the filesystem
has since grown by a mere 40 GB of data, so it shouldn't take that much
longer - certainly another indication that something is wrong.
Of course I have yet to see whether "rsync -Xav" does much better on the
full system ;-)

Hm, not sure whether this all makes sense.

The system runs Debian Etch, kernel 2.6.22, Lustre 1.6.7.1.

Regards,
Thomas



