[Lustre-discuss] failover software - heartbeat
Jim Garlick
garlick at llnl.gov
Tue Jul 14 11:38:50 PDT 2009
On Tue, Jul 14, 2009 at 09:37:54AM -0700, Cliff White wrote:
> Jim Garlick wrote:
> >Hi,
> >
> >OK I have posted it to https://bugzilla.lustre.org/show_bug.cgi?id=20165
> >
> > 20165: scripts for heartbeat v1 integration
> >
> >I added example config files from our test cluster. Probably best to
> >redirect questions/comments/criticisms to the bug and I'll respond there.
>
> Looks very good, thanks bunches. I've added a few extras from the
> discussion. Did you guys try ipfail, or only pingd?
> cliffw
We tried ipfail (unsuccessfully), not pingd.
I think pingd is a v2-only feature? Our work is entirely with v1,
which seemed adequate and also much simpler to understand and get right.
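
For readers wondering what the "ensure MMP is on, devices are present"
safeguard mentioned in the quoted message below might look like, here is a
minimal sketch. It is not the script attached to the bug. It assumes the
target is an ldiskfs/ext-style device whose header can be read with
`dumpe2fs -h`, and that enabled MMP appears as the `mmp` token on the
"Filesystem features:" line. The names parse_features, mmp_enabled and
preflight are hypothetical.

```python
import os
import subprocess


def parse_features(dumpe2fs_header: str) -> set:
    """Return the feature tokens from `dumpe2fs -h` output (assumed format)."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a whitespace-separated list.
            return set(line.split(":", 1)[1].split())
    return set()


def mmp_enabled(dumpe2fs_header: str) -> bool:
    """True if the (assumed) `mmp` feature token is present."""
    return "mmp" in parse_features(dumpe2fs_header)


def preflight(device: str) -> bool:
    """Hypothetical check: device must exist and report MMP as enabled."""
    if not os.path.exists(device):
        return False
    result = subprocess.run(
        ["dumpe2fs", "-h", device],
        capture_output=True, text=True, check=False,
    )
    return result.returncode == 0 and mmp_enabled(result.stdout)
```

An init script could call preflight() on each device and refuse to mount the
target when it returns False, so that a node never starts lustre on a device
that lacks multiple-mount protection.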
> >Jim
> >
> >
> >On Tue, Jul 14, 2009 at 12:26:24PM +1000, Atul Vidwansa wrote:
> >>Hi Jim,
> >>
> >>It would be great if you can attach the scripts to a Lustre bugzilla bug.
> >>
> >>Cheers,
> >>_Atul
> >>
> >>Jim Garlick wrote:
> >>>We recently put heartbeat v1 in production and along the way
> >>>developed some admin scripts including heartbeat resource agent compliant
> >>>lustre init scripts, a script to initiate failover/failback and get
> >>>detailed status, a powerman stonith interface, and various safeguards to
> >>>ensure MMP is on, devices are present and usable, etc. before starting
> >>>lustre.
> >>>
> >>>If this is of general interest I could post it to a bug for review.
> >>>
> >>>Jim
> >>>
> >>>On Mon, Jul 13, 2009 at 01:45:02PM -0600, Lundgren, Andrew wrote:
> >>>
> >>>>It is very difficult to find relevant documentation for heartbeat 1/2.
> >>>>I just finished configuring a heartbeat system and would not recommend
> >>>>it because of the documentation. (They seem to have removed portions
> >>>>of the heartbeat documentation from the site.)
> >>>>Pacemaker is not a simple solution to configure either. I played
> >>>>briefly with the RH clustering software. It does not directly support
> >>>>any FS type other than the basic ext2/ext3, and wasn't happy with a
> >>>>lustre type.
> >>>>--
> >>>>Andrew
> >>>>
> >>>>
> >>>>>-----Original Message-----
> >>>>>From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-
> >>>>>bounces at lists.lustre.org] On Behalf Of Carlos Santana
> >>>>>Sent: Monday, July 13, 2009 11:42 AM
> >>>>>To: lustre-discuss at lists.lustre.org
> >>>>>Subject: [Lustre-discuss] failover software - heartbeat
> >>>>>
> >>>>>Howdy,
> >>>>>
> >>>>>The lustre manual recommends heartbeat for handling failover.
> >>>>>Pacemaker is the successor of heartbeat version 2. So what's recommended -
> >>>>>should we be using pacemaker or stick to heartbeat?
> >>>>>
> >>>>>-
> >>>>>CS.
> >>>>>_______________________________________________
> >>>>>Lustre-discuss mailing list
> >>>>>Lustre-discuss at lists.lustre.org
> >>>>>http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >>>>>
> >>>
> >>
> >>--
> >>==================================
> >>Atul Vidwansa
> >>Sun Microsystems Australia Pty Ltd
> >>Web: http://blogs.sun.com/atulvid
> >>Email: Atul.Vidwansa at Sun.COM
> >>
>