[Lustre-discuss] Implementing MMP correctly

Tue Dec 22 14:21:43 PST 2009

Michael,

to answer your question on the pacemaker mailing list: if you use an 
agent that also checks for all umount bugs, it might work without MMP, but you 
would still be removing a very useful protection. And the situation hasn't 
changed since October, when you last asked a similar question ;)

On Tuesday 22 December 2009, Jim Garlick wrote:
> On Tue, Dec 22, 2009 at 02:12:44PM +0100, Michael Schwartzkopff wrote:
> > Hi,
> >
> > I am trying to understand how to implement MMP correctly in a Lustre
> > failover cluster.
> >
> > As far as I understand, MMP protects the same filesystem from being
> > mounted by different nodes (OSS) of a failover cluster. So far so good.
> >
> > If a node was shut down uncleanly, it will still occupy its filesystems via
> > MMP and thus prevent a clean failover to another node.

How did you get this idea at all?

> 
> Hi, ldiskfs (or e2fsck) will poll the MMP block to see if the other side
> is still updating it before starting.  If updates have ceased, the mount
> or fsck will start.  So the workarounds below are unnecessary.
> 
> > Now I want to
> > implement a clean failover into the Filesystem Resource Agent of
> > pacemaker. Is there a good way to solve the problem with MMP? Possible
> > solutions are:
> >
> > - Disable the MMP feature in a cluster at all, since the resource manager
> > takes care that the same resource is only mounted once in the cluster
> >
> > - Do a "tune2fs -O ^mmp <device>" and a "tune2fs -O mmp <device>" before
> > every mounting of a resource?
> 
> tune2fs -Eclear-mmp is a faster alternative.

Doing that for each and every mount would basically remove the MMP protection. 
I think Michael wants to write an agent that does that automatically...

And again, I submitted and updated a suitable agent in Lustre bugzilla 20807. 
It is almost ready to be submitted to heartbeat/pacemaker, I only need to 
clean up some comments and slightly simplify some umount checks.

> 
> Should only be necessary if e2fsck is interrupted.
> (e2fsck does not regularly update the MMP block like the file system does)
> 
> > - Do a "sleep 10" before mounting a resource? But the manual says "the
> > file system mount require additional time if the file system was not
> > cleanly unmounted."

It will require more time for a journal replay, I guess.

> >
> > - Check if the file system is in use by another OSS through MMP and wait
> > a little bit longer? How do I do this?

Not necessary. All DDN Lustre installations in Europe are now based on 
pacemaker, without any ugly workarounds. But then as I told you before, our 
releases also fix bug 19566 already.
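
To illustrate the polling Jim describes above, here is a toy sketch in 
Python. It is not the ldiskfs or e2fsck code: MmpBlock, safe_to_mount and the 
injected sleep hook are names made up for this example, and the real check 
handles special sequence values and timing details that are left out here. 
The only idea it models is "read the block, wait, read again, and proceed 
only if nobody updated it in between".

```python
import time
from dataclasses import dataclass


@dataclass
class MmpBlock:
    """Toy stand-in for the on-disk MMP block (field names are hypothetical)."""
    seq: int = 0
    node: str = ""


def safe_to_mount(read_block, check_interval, sleep=time.sleep):
    """Return True if the sequence number is unchanged after waiting.

    read_block     -- callable returning the current MmpBlock
    check_interval -- seconds to wait between the two reads
    sleep          -- injectable so the demo can simulate a peer's activity
    """
    before = read_block().seq
    sleep(check_interval)
    after = read_block().seq
    return before == after


def demo():
    block = MmpBlock(seq=41, node="oss1")

    # Crashed peer: nothing touches the block while we wait.
    crashed_peer = safe_to_mount(lambda: block, 0, sleep=lambda _s: None)

    # Live peer: the sleep hook simulates oss1 bumping the sequence meanwhile.
    def peer_updates(_seconds):
        block.seq += 1

    live_peer = safe_to_mount(lambda: block, 0, sleep=peer_updates)
    return crashed_peer, live_peer


if __name__ == "__main__":
    print(demo())
```

In this model the mount goes ahead after an unclean shutdown of the peer 
because its updates have stopped, and it is refused while the peer is still 
writing. Resetting the block before every mount would skip exactly the second 
case.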

-- 
Bernd Schubert
DataDirect Networks