[Lustre-discuss] failover software - heartbeat (Lundgren, Andrew)
Daniel Kulinski
dank at weinmangeoscience.com
Mon Jul 13 13:50:58 PDT 2009
Andrew,
I was able to get ipfail to work on my heartbeat 2.1.3 installation.
Make sure the following line is uncommented in /etc/ha.d/ha.cf:
respawn hacluster /usr/lib64/heartbeat/ipfail
Along with that, you must have a ping line listing the hosts to ping,
separated by spaces.
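As a rough sketch, the relevant part of ha.cf would look something like
this. The two addresses are placeholders for illustration only, not our
actual ping hosts; substitute hosts that should always be reachable from
both nodes.

```
# /etc/ha.d/ha.cf (fragment only; the rest of the file is omitted)

# Hosts that ipfail uses to judge network connectivity.
# Placeholder addresses, assumed for this example.
ping 192.0.2.1 192.0.2.2

# Start ipfail as the hacluster user and restart it if it exits.
respawn hacluster /usr/lib64/heartbeat/ipfail
```

The ipfail path above is the one from my 64-bit install; it may differ on
other systems.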
We have tested this and it works perfectly. We have 3 ethernet networks to
each OSS and MDS pair.
I have no idea what pingd is or how it relates to heartbeat.
Dan Kulinski
>
>Were you able to get monitoring working to detect network failures? (pingd?)
>
>I have it configured, but haven't been able to get it to trigger a failover
>when an MDS cannot ping the network. (I tried with 1.0 and 2.0 conf files;
>I am currently using 2.0.) I have a ticket open with the pacemaker project
>(no ticket system for the HA stuff...) but no resolution. I am considering
>writing a script to down the node when the ping fails, but don't like the idea.
>
>I would also like to get the hpingd functioning to detect a fiber failure,
>but there was less available on that solution.
>
>--
>Andrew