[Lustre-discuss] recovering formatted OST

Sebastian Gutierrez gutseb at gmail.com
Wed Oct 20 20:04:28 PDT 2010


If everything was on LVM, you may be able to recover it, provided nothing else has been written to the disk. I am assuming that you do not have your LVM backup files in /etc/lvm/backup/. If you did, you could use the pvcreate recovery procedure. There are a couple of different walkthroughs that may help, for example:

http://www.novell.com/coolsolutions/appnote/19386.html

The syntax is something like this:

pvcreate --uuid "cTFy1t-Ux56-rtqw-D477-ZbvE-eJgm-zozjao" --restorefile /etc/lvm/backup/<---vg backup---> /dev/sdb
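
For context, pvcreate only rewrites the PV label; the usual next steps are vgcfgrestore to bring back the VG metadata and vgchange to activate the LVs. The sketch below just prints the three commands rather than running them. The helper name lvm_restore_plan is made up for illustration, the VG name ost16vg and device /dev/sdc are taken from the quoted thread below, and the default backup path assumes LVM's standard /etc/lvm/backup/<vgname> location. The UUID has to be the one recorded in your own backup file.

```shell
#!/bin/sh
# Dry-run sketch: print the LVM restore commands instead of executing them.
# lvm_restore_plan is a hypothetical helper, not part of LVM.
# Arguments: PV UUID, VG name, device, optional backup file.
lvm_restore_plan() {
    uuid=$1
    vg=$2
    dev=$3
    backup=${4:-/etc/lvm/backup/$vg}   # assumed default backup location

    # 1. Recreate the PV label with its original UUID.
    echo "pvcreate --uuid $uuid --restorefile $backup $dev"
    # 2. Restore the VG metadata (and with it the LV layout) from the backup.
    echo "vgcfgrestore -f $backup $vg"
    # 3. Activate the restored LVs so they can be inspected.
    echo "vgchange -ay $vg"
}

lvm_restore_plan "cTFy1t-Ux56-rtqw-D477-ZbvE-eJgm-zozjao" ost16vg /dev/sdc
```

Printing the plan first makes it easy to double-check the device name before anything touches the disk.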

Good luck



On Oct 20, 2010, at 7:32 PM, Andreas Dilger wrote:

> Probably LVM will refuse to create a whole-device PV if there is a partition table. 
> 
> Cheers, Andreas
> 
> On 2010-10-20, at 18:31, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
> 
>> Hi Andreas,
>> 
>> If I am going to recreate LVM on the whole device (as it was originally created), do I still need to overwrite the MBR with zeros prior to that? I guess the creation of the LVM will overwrite it, but I am asking just to make sure.
>> 
>> Wojciech
>> 
>> On 20 October 2010 18:40, Andreas Dilger <andreas.dilger at oracle.com> wrote:
>> On 2010-10-20, at 11:36, Wojciech Turek wrote:
>> > Your help is much appreciated, Andreas. May I ask one more question?
>> > I would like to perform the recovery procedure on an image of the disk (I am making it using dd) rather than on the physical device. In order to do that, is it enough to bind the image to a loop device and use that loop device as if it were a physical device?
>> 
>> I'm not sure that is 100% safe.  Working from an image may cause LVM to create the LVs with different parameters for some reason.  Instead, I'd keep the image as a backup and do the recovery on the original device.  Also, e2fsck is likely to run much faster on the original device, which will help you get any remaining data back more quickly.
>> 
>> > On 20 October 2010 17:41, Andreas Dilger <andreas.dilger at oracle.com> wrote:
>> > On 2010-10-20, at 10:15, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
>> >> On 20 October 2010 16:32, Andreas Dilger <andreas.dilger at oracle.com> wrote:
>> >> Right - you need to recreate the LV exactly as it was before. If you created it all at once on the whole LUN, then it is likely to be allocated in a linear way. If there are multiple LVs on the same LUN and they were expanded after use, the chance of recovering them is very low.
>> >> There was one LVM volume on that LUN. I created it using the following commands:
>> >>
>> >> pvcreate /dev/sdc
>> >> vgcreate ost16vg /dev/sdc
>> >> lvcreate --name ost16v -l 100%VG ost16vg
>> >>
>> >> So in order to recreate that LVM volume on the formatted LUN, I need to repeat the above steps. Is that right?
>> >
>> > If you know the exact LVM command then you probably don't need findsuper at all, since you should get back your original LV. The findsuper tool is useful if you don't know the original partition layout.
>> >
>> >> That said, if there were filesystems formatted in each partition, the amount of data loss may be large. You may have some saving grace if the first partitions are very small and fit inside the space previously used by the 400MB journal.
>> >> Unfortunately, the new partitions use much more space than 400MB:
>> >>    8    32 7809904640 sdc
>> >>    8    33   10484719 sdc1
>> >>    8    34    4193280 sdc2
>> >>    8    35    4193280 sdc3
>> >>    8    36    8387584 sdc4
>> >>    8    37 7782640640 sdc5
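
A rough check of the numbers above, as a sketch only. It assumes the listing is /proc/partitions output, where the size column is in 1 KiB blocks, and simply compares the four small partitions with the 400MB journal mentioned above. The variable names are illustrative.

```shell
#!/bin/sh
# Sizes are copied from the listing above and assumed to be 1 KiB blocks,
# as in /proc/partitions.
journal_kib=$((400 * 1024))                               # 400MB journal
small_kib=$((10484719 + 4193280 + 4193280 + 8387584))     # sdc1..sdc4

echo "journal:   $((journal_kib / 1024)) MiB"
echo "sdc1-sdc4: $((small_kib / 1024)) MiB"

# Compare the combined size of the small partitions with the journal size.
if [ "$small_kib" -gt "$journal_kib" ]; then
    echo "small partitions are larger than the old journal area"
fi
```

Under that assumption sdc1 alone is already far bigger than 400MB, so the small partitions cannot all sit inside the old journal space.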
>> >
>> > The only good news is that the new filesystems will be offset from the original filesystem due to the LVM metadata, and you are more likely to have newer data away from the start of the filesystem, so there is some hope of getting some data back.
>> >
>> >
>> >> On 2010-10-20, at 9:06, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
>> >>
>> >>> Thank you for the quick reply.
>> >>> Unfortunately, all partitions were formatted with ext3. Also, I didn't mention it earlier, but the OST was placed on an LVM volume, which is now gone because the installation script formatted the physical device. I understand that this complicates things even further. In that case, I guess I first need to recover the LVM information, otherwise fsck will not be able to find anything. Is that right?
>> >>>
>> >>> Best regards,
>> >>>
>> >>> Wojciech
>> >>>
>> >>> On 20 October 2010 08:46, Andreas Dilger <andreas.dilger at oracle.com> wrote:
>> >>> On 2010-10-19, at 17:01, Wojciech Turek wrote:
>> >>> > Due to a local disk failure in an OSS, one of our /scratch OSTs was formatted by an automatic installation script. This script created 5 small partitions and a 6th partition consisting of the remaining space on that OST. Nothing else was written to that device since then. Is there a way to recover any data from that OST?
>> >>>
>> >>> Your best bet is to first make a full "dd" backup of the OST to a new device (for safety), then restore the original partition table.  If there was not originally a partition table, then you can just erase the new partition table:
>> >>>
>> >>>  dd if=/dev/zero of=/dev/XXX bs=512 count=1
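
To rehearse this dd invocation safely, the sketch below runs it against a throwaway file instead of /dev/XXX. The scratch file, its 4-sector size, and the variable names are assumptions for illustration only. conv=notrunc is added because dd would otherwise truncate a regular file to the single sector it writes; a block device is not affected by that.

```shell
#!/bin/sh
# Rehearsal on a temporary file, NOT on a real device.
img=$(mktemp)

# Build a 4-sector scratch image filled with non-zero bytes ('X').
dd if=/dev/zero bs=512 count=4 2>/dev/null | tr '\0' 'X' > "$img"

# Zero only the first 512-byte sector, keeping the rest of the file.
dd if=/dev/zero of="$img" bs=512 count=1 conv=notrunc 2>/dev/null

# Count non-zero bytes in sector 0 and in the remaining three sectors.
first_nonzero=$(( $(head -c 512 "$img" | tr -d '\0' | wc -c) ))
rest_kept=$(( $(tail -c 1536 "$img" | tr -d '\0' | wc -c) ))

echo "non-zero bytes left in sector 0: $first_nonzero"
echo "non-zero bytes after sector 0:   $rest_kept"

rm -f "$img"
```

Only the first sector is cleared, which is where an MBR-style partition table lives; everything after it stays as it was.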
>> >>>
>> >>> Then run e2fsck -fy, followed by "ll_recover_lost_found_objs" (from a newer lustre RPM, if you don't have it).  It is likely that you will get some or most of the data back.  This depends heavily on exactly what was written over the original filesystem.
>> >>>
>> >>> If it was just a new partition table, there should be relatively little damage (ext3 is very robust this way, and can repair itself so long as the starting alignment is correct).  If there were filesystems formatted in each of these partitions, then the amount of data available will be reduced significantly.
>> >>>
>> >>> Cheers, Andreas
>> >>> --
>> >>> Andreas Dilger
>> >>> Lustre Technical Lead
>> >>> Oracle Corporation Canada Inc.
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Wojciech Turek
>> >>
>> >> Senior System Architect
>> >>
>> >> High Performance Computing Service
>> >> University of Cambridge
>> >> Email: wjt27 at cam.ac.uk
>> >> Tel: (+)44 1223 763517
>> 
>> 
>> Cheers, Andreas