[Lustre-discuss] How do I recover files from partial lustre disk?

Andreas Dilger adilger at Sun.COM
Mon Jun 23 13:54:45 PDT 2008


On Jun 20, 2008  17:27 -0400, Ms. Megan Larko wrote:
> This is a follow-up from Megan on 20 June 2008:
> Success getting file information from remaining OST's.
> 
> Per the advice of Andreas, I mounted my good OST's on my OSS.
> I went to the MDT and mounted the /srv/lustre/mds/crew4-MDT0000.
> 
> On a compute node (not a Lustre data OSS node), I mounted the disk
> (/crew4) and then used lctl to identify the known bad nids in
> /crew4, then ran "device {bad-nid}" followed by "deactivate" for each
> bad nid.  Finally I used Andreas's suggestion of "lfs find --ost
> crew4-OST0001_UUID --ost crew4-OST0003_UUID --ost crew4-OST0004_UUID
> --print  /crew4 >& crew4.find.20Jun08"
> 
> I received a 759 MB text output file of the names of files still
> resident on the remaining OSTs.   (...and there was great rejoicing!)
>   So, I want to cp those known/found file names from the read-only
> mounted device named /crew4 onto some good space.   May I just use the
> Linux system "cp" command, or is there a better Lustre command that
> should be used for this specific task?

If the files are single-stripe files then "cp" is fine.  If the files
have multiple stripes (you can check with "lfs getstripe filename ...")
then you should probably just skip them.
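
As a rough sketch of that check-then-copy step (not something I have run
against your filesystem): the loop below reads a list of paths, one per
line, and copies only the files whose stripe count is 1.  It assumes an
lfs that supports "getstripe -c" to print the stripe count; older
releases may not have that option.  The function names stripe_count and
copy_single_stripe are made up for illustration.

```shell
#!/bin/sh
# Sketch: copy only single-stripe files named in a list (one path per line).

# Print the stripe count of one file.  ASSUMPTION: this lfs supports
# "getstripe -c"; replace this function if your release does not.
stripe_count() {
    lfs getstripe -c "$1"
}

# copy_single_stripe LIST DEST
# Copies each single-stripe regular file in LIST under DEST, keeping its
# full path.  Anything else is reported on stderr and skipped.
copy_single_stripe() {
    list=$1
    dest=$2
    while IFS= read -r f; do
        # The list may contain error messages or directories; skip those.
        [ -f "$f" ] || continue
        n=$(stripe_count "$f") || continue
        if [ "$n" = "1" ]; then
            mkdir -p "$dest/$(dirname "$f")" && cp -p "$f" "$dest/$f"
        else
            echo "skipped (stripes=$n): $f" >&2
        fi
    done < "$list"
}
```

For example, "copy_single_stripe crew4.find.20Jun08 /some/good/space"
(the destination path is a placeholder), with stderr redirected to a log
so you have a record of the striped files that were skipped.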

If there is data in a striped file that is valuable even if you only
have e.g. every other 1MB of the file, then you can recover the readable
parts of the file with:

    COUNT=$(( ($(stat -c %s {filename}) + 65535) / 65536 ))
    dd if={filename} of={savefilename} bs=64k count=$COUNT conv=sync,noerror

The unreadable parts of the file will be filled with binary 0 (NUL) bytes.
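
If you need to do this for many files, the same two steps can be wrapped
in a small function.  This is only a sketch, assuming GNU stat and dd;
the name recover_readable is made up for illustration.

```shell
#!/bin/sh
# recover_readable SRC DST
# Copy SRC to DST in 64 KiB blocks.  Blocks that cannot be read are
# written as NUL bytes (conv=sync,noerror) instead of aborting the copy.
recover_readable() {
    src=$1
    dst=$2
    # File size in bytes (GNU stat), rounded up to whole 64 KiB blocks.
    size=$(stat -c %s "$src") || return 1
    count=$(( (size + 65535) / 65536 ))
    dd if="$src" of="$dst" bs=64k count="$count" conv=sync,noerror 2>/dev/null
}
```

Note that conv=sync also pads a short final block, so the saved copy is
rounded up to a multiple of 64 KiB and may be slightly longer than the
original file.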

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



