[Lustre-discuss] OST crash with group descriptors

thhsieh thhsieh at piano.rcas.sinica.edu.tw
Fri Mar 13 09:11:51 PDT 2009


Dear Andreas,

Thanks so much for your valuable suggestion. Using the

  ll_recover_lost_found_objs [-hv] -d lost+found_directory

command I have recovered most of the lost files from "lost+found".
I very much appreciate your kind help.


Best Regards,

T.H.Hsieh


On Fri, Mar 13, 2009 at 06:23:04AM -0600, Andreas Dilger wrote:
> On Mar 13, 2009  11:03 +0800, thhsieh wrote:
> > There is another tip I can share here. After following Andreas's
> > suggestions, we finally got all the OSTs back. But there are still
> > a lot of files that cannot be recovered. With the "ls -l" command,
> > you can easily identify such files:
> > 
> > -rw-r--r-- 1 thhsieh thhsieh  61440008 2007-05-21 18:49 EIV27
> > -rw-r--r-- 1 thhsieh thhsieh  61440008 2007-05-21 18:49 EIV28
> > ?--------- ? ?       ?               ?                ? EIV29
> > -rw-r--r-- 1 thhsieh thhsieh  61440008 2007-05-21 18:49 EIV30
> > -rw-r--r-- 1 thhsieh thhsieh     19488 2008-09-18 16:04 fort.8
> > 
> > where "EIV29" is the corrupted file.
> 
> Right, because "ls -l" got an error when reading the size for
> this file.
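
For a large directory tree it may be easier to script this check than
to read "ls -l" output by eye. Below is a minimal sketch, assuming that
the "?" entries are exactly those for which lstat() fails; the function
name find_unstatable and its stat parameter are only illustrative names,
not part of any Lustre tool.

```python
import os

def find_unstatable(directory, stat=os.lstat):
    """Return the sorted names in `directory` whose metadata cannot be read.

    `stat` is the function used to read the metadata; it defaults to
    os.lstat and is a parameter only so the check can be exercised
    without a damaged filesystem.
    """
    broken = []
    for name in os.listdir(directory):
        try:
            stat(os.path.join(directory, name))
        except OSError:
            # The directory entry exists, but its attributes cannot be
            # read: this is the case that "ls -l" displays as "?".
            broken.append(name)
    return sorted(broken)
```

Calling find_unstatable() on a directory of the mounted Lustre client
should then list candidates such as "EIV29" above.
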
> 
> > Then in /mnt/lost+found/, you may see a lot of lost files.
> > But it is still difficult to identify which one is which.
> > 
> > If we know some features of the original file, e.g., its creation or
> > last modification time, its approximate size, its owner, or its type,
> > then it is still possible to pick out the correct one. For example,
> > yesterday I managed to pick the right "Zip archive" file out of
> > thousands of files, by selecting the files belonging to the owner and
> > running
> > 
> > 	file <filename>
> > 
> > on each to check its original format. Fortunately there was only one
> > "Zip" format file, so that was it.
> > 
> > Since this technique is very tedious and still cannot guarantee that
> > files are recovered, it is only useful for recovering a few of the
> > most critical files.  However, if you do have a very important file
> > that cannot be lost, then this approach may be worth trying.
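
The manual search described above (filter by owner, then check the file
type) could be scripted roughly as follows. This is only a sketch under
stated assumptions: it looks at the first four bytes for the Zip local
file header signature, which is a much cruder test than the "file"
command, and the names looks_like_zip and zip_candidates are made up
for illustration.

```python
import os

# Signature at the start of a Zip archive that has at least one member.
ZIP_MAGIC = b"PK\x03\x04"

def looks_like_zip(path):
    """Return True if the file at `path` starts with the Zip signature."""
    with open(path, "rb") as f:
        return f.read(len(ZIP_MAGIC)) == ZIP_MAGIC

def zip_candidates(lost_found, uid):
    """Return sorted paths of regular files in `lost_found` that are
    owned by `uid` and start with the Zip signature."""
    matches = []
    for entry in os.scandir(lost_found):
        if not entry.is_file(follow_symlinks=False):
            continue  # skip directories, symlinks, and special files
        if entry.stat(follow_symlinks=False).st_uid != uid:
            continue  # owned by somebody else
        if looks_like_zip(entry.path):
            matches.append(entry.path)
    return sorted(matches)
```

For other file types the signature check would have to be replaced, for
example by calling "file" on each remaining candidate.
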
> 
> There is a tool specifically for this, "ll_recover_lost_found_objs",
> which I mentioned in my earlier email. It runs against the
> ldiskfs-mounted filesystem:
> 
> Usage: ./lustre/utils/ll_recover_lost_found_objs [-hv] -d lost+found_directory
> You need to mount the corrupted OST filesystem and provide the path for the
> lost+found directory as the -d option, for example:
> ll_recover_lost_found_objs -d /mnt/ost/lost+found
> 
> 
> This will move all (or at least most) of the objects from lost+found
> back to their place in the O/0/d* directories, and you will have most
> of your files back.
> 
> 
> The first time Lustre writes to an object it saves the MDS inode number
> and the objid as an extended attribute on the object, so that in the
> case of a directory corruption on the OST it is possible to recover,
> as you need to do.
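
To illustrate where an object ends up once its objid is known, here is
a small sketch of the path computation. It assumes the usual ldiskfs
OST layout in which group 0 objects are spread over 32 subdirectories
chosen by objid modulo 32; the function name object_path and its
default arguments are illustrative only, and the code does not read the
extended attribute itself.

```python
import os

def object_path(ost_root, objid, group=0, subdirs=32):
    """Return the expected location of an OST object under `ost_root`.

    Assumes the O/<group>/d<objid % subdirs>/<objid> layout; `subdirs`
    defaults to 32, which is an assumption about the OST configuration.
    """
    subdir = "d%d" % (objid % subdirs)
    return os.path.join(ost_root, "O", str(group), subdir, str(objid))
```

With the OST mounted at /mnt/ost as in the example above,
object_path("/mnt/ost", 1234) gives a path inside one of the O/0/d*
directories.
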
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 


