[Lustre-discuss] failover software - heartbeat

Jim Garlick garlick at llnl.gov
Tue Jul 14 11:38:50 PDT 2009


On Tue, Jul 14, 2009 at 09:37:54AM -0700, Cliff White wrote:
> Jim Garlick wrote:
> >Hi,
> >
> >OK I have posted it to https://bugzilla.lustre.org/show_bug.cgi?id=20165
> >
> >  20165: scripts for heartbeat v1 integration
> >
> >I added example config files from our test cluster.  Probably best to
> >redirect questions/comments/criticisms to the bug and I'll respond there.
> 
> Looks very good, thanks bunches. I've added a few extras from the 
> discussion. Did you guys try ipfail, or only pingd?
> cliffw

We tried ipfail (unsuccessfully), not pingd.
I think pingd is a v2-only feature?  Our work is entirely with v1,
which seemed adequate and also much simpler to understand and get right.
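
For readers wondering what the pre-start safeguards mentioned in the
original post quoted below (MMP on, devices present) might look like,
here is a minimal sketch. It is not the code attached to the bug. The
function names are hypothetical, and it assumes the text passed to
has_mmp() is header output in the style of "dumpe2fs -h", where enabled
features are listed on a "Filesystem features:" line.

```python
import os
import stat


def has_mmp(dumpe2fs_header: str) -> bool:
    """Return True if 'mmp' is listed on the 'Filesystem features:' line.

    Assumption: the input is header text in the style of `dumpe2fs -h`.
    """
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            features = line.split(":", 1)[1].split()
            return "mmp" in features
    return False


def is_block_device(path: str) -> bool:
    """Return True if path exists and is a block device."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        # Missing or inaccessible device: treat as not usable.
        return False
```

A start script could refuse to mount a target unless both checks pass
for its backing device.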

> >Jim
> >
> >
> >On Tue, Jul 14, 2009 at 12:26:24PM +1000, Atul Vidwansa wrote:
> >>Hi Jim,
> >>
> >>It would be great if you can attach the scripts to a Lustre bugzilla bug.
> >>
> >>Cheers,
> >>_Atul
> >>
> >>Jim Garlick wrote:
> >>>We recently put heartbeat v1 in production and along the way
> >>>developed some admin scripts including heartbeat resource agent compliant
> >>>lustre init scripts, a script to initiate failover/failback and get
> >>>detailed status, a powerman stonith interface, and various safeguards
> >>>to ensure MMP is on, devices are present and usable, etc. before
> >>>starting lustre.
> >>>
> >>>If this is of general interest I could post it to a bug for review.
> >>>
> >>>Jim
> >>>
> >>>On Mon, Jul 13, 2009 at 01:45:02PM -0600, Lundgren, Andrew wrote:
> >>> 
> >>>>It is very difficult to find relevant documentation for heartbeat 1/2. 
> >>>>I just finished configuring a heartbeat system and would not recommend 
> >>>>it because of the documentation.  (They seem to have removed portions 
> >>>>of the heartbeat documentation from the site.)  
> >>>>Pacemaker is not a simple solution to configure either. I played 
> >>>>briefly with the RH clustering software.  It does not directly support 
> >>>>any FS type other than the basic ext2/ext3, and wasn't happy with a 
> >>>>lustre type. 
> >>>>--
> >>>>Andrew
> >>>>
> >>>>   
> >>>>>-----Original Message-----
> >>>>>From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-
> >>>>>bounces at lists.lustre.org] On Behalf Of Carlos Santana
> >>>>>Sent: Monday, July 13, 2009 11:42 AM
> >>>>>To: lustre-discuss at lists.lustre.org
> >>>>>Subject: [Lustre-discuss] failover software - heartbeat
> >>>>>
> >>>>>Howdy,
> >>>>>
> >>>>>The lustre manual recommends heartbeat for handling failover.
> >>>>>Pacemaker is the successor of heartbeat version 2. So what's recommended -
> >>>>>should we be using pacemaker or stick to heartbeat?
> >>>>>
> >>>>>-
> >>>>>CS.
> >>>>>_______________________________________________
> >>>>>Lustre-discuss mailing list
> >>>>>Lustre-discuss at lists.lustre.org
> >>>>>http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >>>>>     
> >>
> >>-- 
> >>==================================
> >>Atul Vidwansa
> >>Sun Microsystems Australia Pty Ltd
> >>Web: http://blogs.sun.com/atulvid
> >>Email: Atul.Vidwansa at Sun.COM
> >>
> 
