[Lustre-discuss] Recovery from Hardware Failure
Cliff White
cliffw at whamcloud.com
Mon Feb 7 15:00:47 PST 2011
You should not have to run lfsck if the initial fscks come back clean.
cliffw
On Mon, Feb 7, 2011 at 1:16 PM, Joe Digilio <jgd-lustre at metajoe.com> wrote:
> Last week we experienced a major hardware failure (disk controller)
> that brought down our system hard. Now that I have the replacement
> controller, I want to make sure I recover correctly. Below is the
> procedure I plan to follow based on what I've gathered from the
> Operations Manual.
>
> Any comments?
> Do I need to create the mds/ost DBs AFTER ll_recover_lost_found_objs?
>
> Thanks!
> -Joe
>
>
> ###MDT Recovery
> # Capture fs state before doing anything
> e2fsck -vfn /dev/$MDTDEV
> # "safe" repair
> e2fsck -vfp /dev/$MDTDEV
> # Verify no more problems and generate mdsdb
> e2fsck -vfn --mdsdb /tmp/mdsdb /dev/$MDTDEV
>
> ###OST Recovery
> foreach OST
> # Capture fs state before doing anything
> e2fsck -vfn /dev/$OSTDEV
> # "safe" repair
> e2fsck -vfp /dev/$OSTDEV
> # Verify no more problems
> e2fsck -vfn --mdsdb /tmp/mdsdb --ostdb /tmp/ostXdb /dev/$OSTDEV   # also generates ostXdb
>
> ### Recover lost+found Objects
> foreach OST
> mount -t ldiskfs /dev/$OSTDEV /mnt/ost
> ll_recover_lost_found_objs -v -d /mnt/ost/lost+found
>
> ### Coherency Check
> lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ost1db,/tmp/ost2db,...,/tmp/ostNdb /lustre
>
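The two "foreach OST" steps in the quoted plan could be scripted roughly as below. This is only a sketch, not a tested recovery procedure: the commands and flags are copied from the plan, while OST_DEVS, DRY_RUN, run, fsck_ost and recover_ost are hypothetical names made up for the example, and the device names are placeholders. It handles each OST in turn (fsck passes, then lost+found recovery) and, as in the plan, generates the ostdb before ll_recover_lost_found_objs runs. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: per-OST loop for the quoted plan. With DRY_RUN=1 (the default
# here) commands are printed, not executed. Set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# fsck_ost DEVICE INDEX: the three e2fsck passes from the plan for one OST.
fsck_ost() {
    run e2fsck -vfn "$1"    # capture fs state, change nothing
    run e2fsck -vfp "$1"    # "safe" repair
    # verify and write this OST's database (needs /tmp/mdsdb from the MDT pass)
    run e2fsck -vfn --mdsdb /tmp/mdsdb --ostdb "/tmp/ost${2}db" "$1"
}

# recover_ost DEVICE: move lost+found objects back, then unmount again.
recover_ost() {
    run mount -t ldiskfs "$1" /mnt/ost || return 1
    run ll_recover_lost_found_objs -v -d /mnt/ost/lost+found
    # umount is not in the quoted plan; added so the next OST can use /mnt/ost
    run umount /mnt/ost
}

# OST_DEVS is a hypothetical space-separated device list; replace with your own.
OST_DEVS=${OST_DEVS:-"/dev/OST1DEV /dev/OST2DEV"}
i=1
for dev in $OST_DEVS; do
    fsck_ost "$dev" "$i"
    recover_ost "$dev"
    i=$((i + 1))
done
```

Reading through the dry-run output before setting DRY_RUN=0 gives a chance to confirm the device list and the ostNdb file names first.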