[Lustre-discuss] Heartbeat, LVM and Lustre

Jim Garlick garlick at llnl.gov
Thu Dec 10 16:07:19 PST 2009


On Thu, Dec 10, 2009 at 02:35:38PM -0800, Adam Gandelman wrote:
> Brian J. Murrell wrote:
> > Indeed.  That is my understanding too, and further, if heartbeat finds a
> > resource already running on a node on which it's trying to start it,
> > it stops and throws its hands up.  When the O/S is starting LVM, both
> > nodes end up doing that.
> >   
> One of many limitations in heartbeat v1 clusters.  Pacemaker and, IIRC,
> heartbeat2 crm will attempt to stop an overactive resource when it
> notices it is active somewhere it shouldn't be.  Also, if a stop request
> fails to clear up the confusion and ensure it is running on just one
> node: STONITH.  V1 falls short in this case, too.  hb2/pacemaker HA
> clusters do more to avoid throwing their hands up and accepting no
> availability.  All the more reason why anyone going HA in production
> should brave the learning curve and update to something current.

Here we have a configuration problem and the right thing is probably
to throw up hands and make somebody fix it.  It could be dangerous to
have LVM start on both nodes, run for a while exposed to races, then
have heartbeat shut down one side and "just work".

But I see your point.
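
To make the two behaviours concrete, here is a rough sketch of the
decision each style makes when a resource turns up active on more than
one node.  This is a toy model only: the function names, the "oss1" /
"oss2" node names and the try_stop callback are all made up for
illustration, and none of it is heartbeat or pacemaker code.

```python
from typing import Callable, List, Set


def recover_v1_style(active_nodes: Set[str], target: str) -> List[str]:
    """Toy model of the v1 behaviour described above (hypothetical names)."""
    # Resource is already running where we were about to start it:
    # do nothing further and leave it for an admin to sort out.
    if target in active_nodes:
        return ["give_up"]
    return [f"start:{target}"]


def recover_crm_style(
    active_nodes: Set[str],
    target: str,
    try_stop: Callable[[str], bool],
) -> List[str]:
    """Toy model of the stop-then-STONITH escalation (hypothetical names)."""
    actions: List[str] = []
    for node in sorted(active_nodes):
        if node == target:
            continue  # this is the one place the resource should run
        if try_stop(node):
            actions.append(f"stop:{node}")
        else:
            # A failed stop leaves the state unknown, so fence the node.
            actions.append(f"fence:{node}")
    if target not in active_nodes:
        actions.append(f"start:{target}")
    return actions
```

Note that even in the second function the resource has already been
active on both nodes by the time anything is stopped, which is the
window I am worried about.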

> -- 
> : Adam Gandelman
> : LINBIT | Your Way to High Availability
> : Telephone: 503-573-1262 ext. 203
> : Sales: 1-877-4-LINBIT / 1-877-454-6248
> :
> : 7959 SW Cirrus Dr.
> : Beaverton, OR 97008
> :
> : http://www.linbit.com
> 
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss


