[lustre-discuss] 2.1 MDT recovery on test hardware

Ben Evans bevans at cray.com
Fri Jun 26 07:09:16 PDT 2015

I think I see where you're going with this, but I think you need to cobble together a complete test system in order to do it.

1) So, set up a single MDS/(MDT + MGT), and an OSS/OST.

2) Mount a client, dump some data in (for what you're trying, you want to create lots of files, not lots of data)

3) Create MDS2 (give it its own IP address), and set up the lvm mirror as you described.  Wait for it to sync.

4) Use tunefs.lustre to add MDS2 as a failnode for the MDT

5) Start creating some more files.

6) Power off MDS1, and mount the MDT on MDS2.
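
A rough sketch of what steps 4 and 6 might look like as commands follows. The NIDs, device path, mount points, and filesystem name are made-up placeholders, not values from this thread, and the script only prints the commands unless DRY_RUN=0 is set. As far as I know, tunefs.lustre expects the target to be unmounted, so step 4 likely implies a brief unmount of the MDT on MDS1. Because the MGS lives on the same device as the MDT here, the client mount lists both MDS NIDs so it can still find the MGS after failover.

```shell
#!/bin/sh
# Sketch of steps 4 and 6. Every value below is a hypothetical placeholder.
MDS1_NID=${MDS1_NID:-10.0.0.1@o2ib}      # assumed NID of MDS1
MDS2_NID=${MDS2_NID:-10.0.0.2@o2ib}      # assumed NID of MDS2
MDT_DEV=${MDT_DEV:-/dev/vg_mdt/lv_mdt}   # assumed LVM volume holding the MGS/MDT
MDT_MNT=${MDT_MNT:-/mnt/mdt}             # assumed server-side mount point
FSNAME=${FSNAME:-testfs}                 # assumed filesystem name
CLIENT_MNT=${CLIENT_MNT:-/mnt/testfs}    # assumed client mount point
DRY_RUN=${DRY_RUN:-1}                    # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 4: record MDS2 as a failover node for the MDT.
step4_add_failnode() {
    run tunefs.lustre --failnode="$MDS2_NID" "$MDT_DEV"
}

# Step 6: after MDS1 is powered off, mount the MDT on MDS2.
step6_mount_on_mds2() {
    run mount -t lustre "$MDT_DEV" "$MDT_MNT"
}

# Client mount naming both possible MGS locations, separated by a colon.
client_mount() {
    run mount -t lustre "$MDS1_NID:$MDS2_NID:/$FSNAME" "$CLIENT_MNT"
}

step4_add_failnode
step6_mount_on_mds2
client_mount
```

Each function runs on a different host in practice (step 4 on MDS1, step 6 on MDS2, the last one on the client); they are grouped here only to show the sequence.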

Play around with the scratch systems a few times, reformat the whole thing, start over, etc., and figure out whether you can do this ... failover? ... consistently, or whether you need to halt the clients for everything to succeed.  Also figure out whether you can unmount and remount the whole thing.
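
One way to judge whether a given trial was "consistent" is to create a known set of empty files before the failover and count how many are missing afterwards. The sketch below is only an illustration: the function names, the directory layout (1000 files per subdirectory), and the idea of using file presence as the success criterion are assumptions, not something prescribed by Lustre.

```shell
#!/bin/sh
# make_files DIR COUNT
# Create COUNT empty files under DIR, 1000 per subdirectory.
# This generates metadata load (many files) rather than data load.
make_files() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        sub="$dir/d$((i / 1000))"
        mkdir -p "$sub" || return 1
        : > "$sub/f$i" || return 1
        i=$((i + 1))
    done
}

# count_missing DIR COUNT
# Print how many of the files that make_files would have created are absent.
count_missing() {
    dir=$1
    count=$2
    i=0
    missing=0
    while [ "$i" -lt "$count" ]; do
        [ -e "$dir/d$((i / 1000))/f$i" ] || missing=$((missing + 1))
        i=$((i + 1))
    done
    echo "$missing"
}

# Example (hypothetical client mount point):
#   make_files /mnt/testfs/run1 50000      # before powering off MDS1
#   count_missing /mnt/testfs/run1 50000   # after the MDT is up on MDS2
```

A result of 0 from count_missing after each trial would suggest the clients rode through the failover; anything else is a hint that the clients need to be halted first.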

-Ben Evans

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Crowe, Tom
Sent: Thursday, June 25, 2015 6:49 PM
To: Christopher J. Morrone
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] 2.1 MDT recovery on test hardware

Hi Chris,

Thanks for the response. Hopefully I can explain the scenario in greater detail and shed some light on this. 

The MDT is scheduled for replacement with newer/faster/better hardware. The DD-based backups have historically taken over 30 hours. LVM snapshots are used to avoid a terribly long outage, so the backup is actually of the LVM snapshot, and the filesystem stays up during the "point in time" DD backup.

The thought I had was to restore the DD backup to test gear, wire up a new OST, mount a client, and put a small load on the test gear. Then use LVM's pvmove to migrate the block devices of the MDT while the filesystem is up and the client continues to churn data. Basically, simulate the MDT migration via pvmove on test gear.
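
For reference, the usual shape of an online pvmove migration is sketched below. The volume group name and the two device paths are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set; it is an outline of the generic LVM procedure, not a tested recipe for a Lustre MDT.

```shell
#!/bin/sh
# Hypothetical names; substitute the real volume group and devices.
VG=${VG:-vg_mdt}            # assumed volume group holding the MDT
OLD_PV=${OLD_PV:-/dev/sdc}  # assumed physical volume on the old storage
NEW_PV=${NEW_PV:-/dev/sdd}  # assumed physical volume on the new storage
DRY_RUN=${DRY_RUN:-1}       # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded.
migrate_plan() {
    run pvcreate "$NEW_PV" &&            # label the new device for LVM
    run vgextend "$VG" "$NEW_PV" &&      # add it to the volume group
    run pvmove "$OLD_PV" "$NEW_PV" &&    # move extents while the LV stays active
    run vgreduce "$VG" "$OLD_PV"         # drop the emptied old device
}

migrate_plan
```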

The goal is to accomplish the MDT migration while avoiding the estimated 40 hour outage of a DD backup/restore.

I have reviewed and followed the Lustre 2.x manual, specifically section 14.5, "Changing a Server NID", and all seems well except the client mount. I receive the following error when attempting to mount a client:

mount.lustre: mount at o2ib:/lustre at /lustre/client1 failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

The client can lctl ping the MGS/MDT (same node) without any issue, and the MGS/MDT can lctl ping the client as well.
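
Since lctl ping only exercises LNET, a successful ping does not by itself answer the two questions in the error text (MGS specification and filesystem name). A small helper like the one below can split a client mount spec into the pieces worth checking one by one. The function name and the example spec are hypothetical; the helper only prints the lctl ping commands to run and the filesystem name the client is asking the MGS for.

```shell
#!/bin/sh
# check_spec SPEC
# SPEC has the form NID[:NID...]:/fsname (alternative NIDs of one node may be
# comma-separated). Prints one "lctl ping" line per NID, then the fsname.
check_spec() {
    spec=$1
    fsname=${spec##*:/}   # text after the final ":/"
    nids=${spec%:/*}      # text before the final ":/"
    old_ifs=$IFS
    IFS=':,'              # split the NID list on colons and commas
    for nid in $nids; do
        echo "lctl ping $nid"
    done
    IFS=$old_ifs
    echo "fsname: $fsname"
}

# Example with made-up NIDs:
check_spec "10.0.0.1@o2ib:10.0.0.2@o2ib:/lustre"
```

The printed fsname can then be compared against what the restored MDT reports for itself, since a mismatch there is one of the causes the error message hints at.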

The overall test is to validate the LVM pvmove process. I have migrated plenty of other block-based storage with a similar LVM procedure, but never a Lustre MDT.

Ultimately, I may need to lobby for an extended outage, and simply incur the 40 hour downtime for the DD backup/restore. But I would really like to understand why the client won't mount. The test has dovetailed into a mini recovery exercise, which I feel is not complete unless a client can access the filesystem. 

Thank you for your comments. I am open to suggestions.


> On Jun 25, 2015, at 5:13 PM, Christopher J. Morrone <morrone2 at llnl.gov> wrote:
> I think the major problem is going to be that your MDT image is not terribly useful without the OSTs that belong to the MDT.  The new OSTs don't contain any of the objects that the MDT references.
> Back in the old Lustre 2.1 code you won't have any of the lfsck code that can deal with MDT-to-OST inconsistency problems.  And even if you did, every file in your filesystem would be removed or moved to lost+found.
> It is not immediately clear to me what amount of useful testing could be done under that situation.  Maybe there is something.
> Chris
> On 06/25/2015 01:06 PM, Crowe, Tom wrote:
>> Greetings,
>> I am investigating the possibility of restoring a DD backup of our 
>> MDT, onto test hardware. Our filesystem is 2.1 based.
>> The general idea would be to get the MDT/MGS restored in their 
>> entirety, change the MGSNODE parameter on the MDT to reflect the test 
>> hardware LNET setup, add some new OSTs, have clients mount the new
>> setup, and proceed with our testing.
>> Is there a procedure that outlines this process? I suspect the
>> exercise could be considered a disaster recovery test, but we do not
>> have any intention at this time to relocate and/or recover any of the
>> original OSTs.
>> Thank You.
>> -Tom Crowe
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org