[Lustre-discuss] How do I recover files from partial lustre disk?
Andreas Dilger
adilger at sun.com
Tue Jun 17 21:48:29 PDT 2008
On Jun 16, 2008 15:37 -0700, megan wrote:
> I am using the Lustre 2.6.18-53.1.13.el5_lustre.1.6.4.3smp kernel on a
> CentOS 5 x86_64 Linux box.
> We had a hardware problem that destroyed the underlying ext3 partition
> tables. As a result, only three of the five OSTs are mountable. The
> main Lustre filesystem cannot be mounted because the MDS knows that
> two of its parts are missing.
It should be possible to mount a Lustre filesystem with OSTs that
are not available. However, access to files on the unavailable
OSTs will cause the process to wait on OST recovery.
> The underlying set-up is JBOD hardware that is passed to the Linux OS
> (via an LSI 8888ELP card in this case) as simple devices, i.e. sde,
> sdf, ...  The simple devices were partitioned using parted and
> formatted ext3, then Lustre was built on top of the five ext3 units.
> There was no striping done across units/JBODs. Three of the five
> units passed an e2fsck and an lfsck. Those remaining units are
> mounted as follows:
> /dev/sdc   13T  6.3T  5.7T  53%  /srv/lustre/OST/crew4-OST0003
> /dev/sdd   13T  6.3T  5.7T  53%  /srv/lustre/OST/crew4-OST0004
> /dev/sdf   13T  6.2T  5.8T  52%  /srv/lustre/OST/crew4-OST0001
>
> Given that it is unlikely we shall be able to recover the underlying
> ext3 on the other two units, is there some method by which I might
> rescue the data from the three units currently mounted on the OSS?
>
> Any and all suggestion genuinely appreciated.
The recoverability of your data depends heavily on the striping of
the individual files (i.e. the default striping). If your files have
a default stripe_count = 1, then you can probably recover 3/5 of the
files in the filesystem. If your default stripe_count = 2, then you
can probably only recover 1/5 of the files, and if you have a higher
stripe_count you probably can't recover any files.
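You can check which case applies before starting; `lfs getstripe` reports
both the default layout and each file's actual layout. A rough sketch
(the mountpoint /mnt/crew4 is only an example):

    # Show the filesystem's default stripe layout (count, size, offset):
    lfs getstripe -d /mnt/crew4

    # Show an individual file's actual layout, including which OST
    # objects back it (paths here are examples):
    lfs getstripe /mnt/crew4/some/file

A file whose objects all live on the surviving OSTs is fully
recoverable regardless of the filesystem-wide default.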
What you need to do is mount one of the clients and mark the
corresponding OSCs inactive with:

    lctl dl                      # get device numbers for OSC 0000 and OSC 0002
    lctl --device N deactivate
Then, instead of the clients waiting for the OSTs to recover the
client will get an IO error when it accesses files on the failed OSTs.
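Put together, the sequence on a client looks roughly like this (the
device numbers 5 and 7 are made-up examples; take the real ones from
your own `lctl dl` output):

    # List configured devices and note the device numbers of the OSCs
    # for the two lost OSTs (crew4-OST0000 and crew4-OST0002):
    lctl dl | grep osc

    # Deactivate each of them (5 and 7 are hypothetical numbers):
    lctl --device 5 deactivate
    lctl --device 7 deactivate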
To get a list of the files that are on the good OSTs run:
    lfs find --ost crew4-OST0001_UUID --ost crew4-OST0003_UUID \
             --ost crew4-OST0004_UUID {mountpoint}
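Once that list is generated, the surviving files can be copied off the
filesystem. A rough sketch, assuming (as examples) the client mount is
/mnt/crew4 and /srv/rescue is the destination:

    # With stripe_count > 1, a listed file may still have objects on a
    # lost OST, so reads can fail with EIO; log the failures for review.
    lfs find --ost crew4-OST0001_UUID --ost crew4-OST0003_UUID \
             --ost crew4-OST0004_UUID /mnt/crew4 |
    while read -r f; do
        # cp --parents recreates the directory tree under /srv/rescue
        cp --parents "$f" /srv/rescue/ 2>>/srv/rescue/failed.log
    done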
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.