[Lustre-discuss] failover software - heartbeat

Lundgren, Andrew Andrew.Lundgren at Level3.com
Tue Jul 14 11:41:29 PDT 2009


I tried pingd in v2 and was unable to get it working.

(I also tried it in pacemaker and ended up opening a ticket that we haven't finished working through yet.)

The fiber ping is only around as a v1 tool as far as I can tell. Since I wasn't able to get normal ping to function, I never even tried the FC stuff.

> -----Original Message-----
> From: Jim Garlick [mailto:garlick at llnl.gov]
> Sent: Tuesday, July 14, 2009 12:39 PM
> To: Cliff White
> Cc: Atul Vidwansa; Lundgren, Andrew; lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] failover software - heartbeat
> 
> On Tue, Jul 14, 2009 at 09:37:54AM -0700, Cliff White wrote:
> > Jim Garlick wrote:
> > >Hi,
> > >
> > >OK I have posted it to https://bugzilla.lustre.org/show_bug.cgi?id=20165
> > >
> > >  20165: scripts for heartbeat v1 integration
> > >
> > >I added example config files from our test cluster.  Probably best to
> > >redirect questions/comments/criticisms to the bug and I'll respond there.
> >
> > Looks very good, thanks bunches. I've added a few extras from the
> > discussion. Did you guys try ipfail, or only pingd?
> > cliffw
> 
> We tried ipfail (unsuccessfully), not pingd.
> I think pingd is a v2 only feature?  Our work is entirely with v1,
> which seemed adequate and also much simpler to understand and get right.
> 
> > >Jim
> > >
> > >
> > >On Tue, Jul 14, 2009 at 12:26:24PM +1000, Atul Vidwansa wrote:
> > >>Hi Jim,
> > >>
> > >>It would be great if you can attach the scripts to a Lustre bugzilla bug.
> > >>
> > >>Cheers,
> > >>_Atul
> > >>
> > >>Jim Garlick wrote:
> > >>>We recently put heartbeat v1 in production and along the way
> > >>>developed some admin scripts including heartbeat resource agent compliant
> > >>>lustre init scripts, a script to initiate failover/failback and get
> > >>>detailed status, a powerman stonith interface, and various safeguards to
> > >>>ensure MMP is on, devices are present and usable, etc. before starting
> > >>>lustre.
> > >>>
> > >>>If this is of general interest I could post it to a bug for review.
> > >>>
> > >>>Jim
> > >>>
> > >>>On Mon, Jul 13, 2009 at 01:45:02PM -0600, Lundgren, Andrew wrote:
> > >>>
> > >>>>It is very difficult to find relevant documentation for heartbeat 1/2.
> > >>>>I just finished configuring a heartbeat system and would not recommend
> > >>>>it because of the documentation.  (They seem to have removed portions
> > >>>>of the heartbeat documentation from the site.)
> > >>>>Pacemaker is not a simple solution to configure either. I played
> > >>>>briefly with the RH clustering software.  It does not directly support
> > >>>>any FS type other than the basic ext2/ext3, and wasn't happy with a
> > >>>>lustre type.
> > >>>>--
> > >>>>Andrew
> > >>>>
> > >>>>
> > >>>>>-----Original Message-----
> > >>>>>From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlos Santana
> > >>>>>Sent: Monday, July 13, 2009 11:42 AM
> > >>>>>To: lustre-discuss at lists.lustre.org
> > >>>>>Subject: [Lustre-discuss] failover software - heartbeat
> > >>>>>
> > >>>>>Howdy,
> > >>>>>
> > >>>>>The lustre manual recommends heartbeat for handling failover.
> > >>>>>Pacemaker is the successor of heartbeat version 2. So what's
> > >>>>>recommended - should we be using pacemaker or stick to heartbeat?
> > >>>>>
> > >>>>>-
> > >>>>>CS.
> > >>>>>_______________________________________________
> > >>>>>Lustre-discuss mailing list
> > >>>>>Lustre-discuss at lists.lustre.org
> > >>>>>http://lists.lustre.org/mailman/listinfo/lustre-discuss
> > >>>>>
> > >>
> > >>--
> > >>==================================
> > >>Atul Vidwansa
> > >>Sun Microsystems Australia Pty Ltd
> > >>Web: http://blogs.sun.com/atulvid
> > >>Email: Atul.Vidwansa at Sun.COM
> > >>
> >


