[Lustre-discuss] Lustre DRBD failover time

tao.a.wu at nokia.com tao.a.wu at nokia.com
Tue Jul 14 12:05:46 PDT 2009


Yes, it is the latter... Thanks for the info.

A related but different question: Lustre 2.0 will have replication. Under 2.0 (with replication), what would happen if the primary node goes down? Would the backup node be able to take over the load in a shorter period of time? Or is the replication feature for something else?

Thanks,

-Tao

-----Original Message-----
From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of ext Brian J. Murrell
Sent: Tuesday, July 14, 2009 12:10 PM
To: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre DRBD failover time

On Tue, 2009-07-14 at 17:54 +0200, tao.a.wu at nokia.com wrote:
>  
> Hi, all,
>  
> I am evaluating Lustre with DRBD failover, and experiencing about 2 
> minutes in OSS failover time to switch to the secondary node.

What do these 2 minutes include? Is it just the time for the second OSS to mount the disk and start recovery, or is it the time to detect the primary failure and have the secondary complete recovery so that the clients are fully functional again?

If the latter, then you are doing quite well. Recovery is not an instantaneous process. Much work needs to be done to ensure coherency between what is on the disk of the failed-over OST and what the clients think is on disk. Getting to this state requires that all clients synchronize with the OST, and waiting for many clients to do this can currently take many minutes, as each client first has to notice that the primary is dead and then sync up with the failover node. Some clients might not even be available to sync, in which case you have to wait for a timeout.
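
As a rough illustration of why the slowest (or a missing) client dominates the total, here is a toy model in Python. It is only a sketch, not Lustre code: the function name, its parameters, and all the numbers are hypothetical, and it assumes recovery finishes once every client has either reconnected or been given up on after a timeout.

```python
from typing import Optional, Sequence


def estimate_recovery_seconds(
    detect_s: float,
    mount_s: float,
    client_reconnect_s: Sequence[Optional[float]],
    recovery_timeout_s: float,
) -> float:
    """Toy estimate of failure-to-full-recovery time (hypothetical model).

    detect_s            time to notice the primary is dead and fail over
    mount_s             time for the secondary to mount the disk
    client_reconnect_s  per-client time to reconnect and sync, or None
                        for a client that never comes back
    recovery_timeout_s  how long the server waits before giving up
    """
    # A client that never shows up costs the full timeout; a slow client
    # costs its own reconnect time, capped at the timeout.
    waits = [
        recovery_timeout_s if t is None else min(t, recovery_timeout_s)
        for t in client_reconnect_s
    ]
    # The server waits on all clients in parallel, so the slowest decides.
    slowest = max(waits) if waits else 0.0
    return detect_s + mount_s + slowest


# All clients present: the slowest one (60 s) sets the recovery window.
print(estimate_recovery_seconds(30, 10, [5, 20, 60], 300))
# One client unavailable: the full timeout (300 s) is paid.
print(estimate_recovery_seconds(30, 10, [5, None, 60], 300))
```

In this model a single unavailable client turns a recovery bounded by the slowest live client into one bounded by the timeout, which is the case described above.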

So if you are talking 2 minutes from failure to full recovery, you are not likely going to put much of a dent in this.

Lustre 1.8 has adaptive timeouts enabled and that should help in optimal situations, but it will still take time to do a full recovery.

b.




