[Lustre-discuss] Confusion with failover

Klaus Steden klaus.steden at thomson.net
Fri Jun 27 12:17:27 PDT 2008

On 6/26/08 9:16 PM, "Dhruv" <DhruvDesaai at gmail.com> did etch on stone
tablets:

> Actually, with my kernel of 2.6.9-22, Lustre 1.4.5.1 fits. And I am
> not in a position to change the OS itself.
> 
> I tried failover of the OSTs without Linux-HA. It worked fairly well.
> I am now testing the same rigorously to see whether I am correct. But
> failover of the MDS without HA didn't work at all.
> 
> Can it work without HA?
> 
No. As Brian pointed out, Lustre supports failover at the server level, but
detection, fencing, etc. have to be handled by another process external to
Lustre. Most people use Linux-HA, including myself, and I find it to be
robust and fairly straightforward to implement. However, because you're
using 1.4, you might have to resort to some "script-fu" to get the
remounting operation to work properly.
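
One shape that "script-fu" could take is a small resource script that HA
calls with start, stop, or status. This is only a sketch: mds_resource is a
made-up name, and START_CMD, STOP_CMD and STATUS_CMD are placeholders that
you would fill in with the lconf invocations that work for your own 1.4
configuration. No real lconf command line is assumed here.

```shell
#!/bin/sh
# Hypothetical HA resource script for a Lustre 1.4 MDS.
# START_CMD, STOP_CMD and STATUS_CMD are placeholders (assumptions): set them
# to the commands that start, stop, and check the MDS on your system.
# They default to "false" so an unconfigured script fails instead of
# pretending the MDS came up.
: "${START_CMD:=false}"
: "${STOP_CMD:=false}"
: "${STATUS_CMD:=false}"

mds_resource() {
    case "$1" in
        start)
            # Bring the MDS up on this node.
            sh -c "$START_CMD"
            ;;
        stop)
            # Release the MDS so the peer node can take it over.
            sh -c "$STOP_CMD"
            ;;
        status)
            # Report state as a single word: "running" or "stopped".
            if sh -c "$STATUS_CMD"; then
                echo running
            else
                echo stopped
            fi
            ;;
        *)
            echo "usage: mds_resource {start|stop|status}" >&2
            return 1
            ;;
    esac
}
```

To use it as a real script, you would finish the file with a line such as
mds_resource "$@" and name the script in haresources in place of the
Filesystem entry. The point of the design is that HA only ever sees
start/stop/status and never needs to know about lconf.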

Here is a paste of my /etc/ha.d/haresources file. Lustre 1.6 targets are
mounted with the regular Linux 'mount' command, meaning I can treat my Lustre
MDT as a regular disk, which HA supports very well. If you use lconf (as on
1.4), you'll have to make some sort of script-based call-out to have the
secondary MDS start when HA detects failure on the primary.

-- cut --
[root@mds-0-0 ~]# cat /etc/ha.d/haresources
mds-0-0.local 172.16.2.252 Filesystem::-Llustre-MDT0000::/mnt/lustremdt::lustre
-- cut --
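
To make the haresources entry less cryptic: the Filesystem resource is just
device, mount point, and filesystem type separated by '::'. Here is a rough
sketch of how it reads as a mount command, assuming HA hands the '-L' label
argument through to mount unchanged. resource_to_mount is a made-up helper
for illustration, not part of Linux-HA or Lustre.

```shell
#!/bin/sh
# Hypothetical helper: print the mount command that corresponds to a
# haresources entry of the form Filesystem::<device>::<dir>::<fstype>.
resource_to_mount() {
    spec=${1#Filesystem::}      # drop the resource name
    device=${spec%%::*}         # first field: device (here a -L label)
    rest=${spec#*::}            # remaining fields: <dir>::<fstype>
    dir=${rest%%::*}            # second field: mount point
    fstype=${rest#*::}          # third field: filesystem type
    printf 'mount -t %s %s %s\n' "$fstype" "$device" "$dir"
}

resource_to_mount 'Filesystem::-Llustre-MDT0000::/mnt/lustremdt::lustre'
# prints: mount -t lustre -Llustre-MDT0000 /mnt/lustremdt
```

Running that mount command by hand on the secondary is a quick way to check
that the MDT will mount there before letting HA do it for you.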

cheers,
Klaus



