[Lustre-discuss] Question about failnode

Brian J. Murrell Brian.Murrell at Sun.COM
Mon Oct 29 07:36:32 PDT 2007


On Mon, 2007-10-29 at 16:58 +0900, Kazuki Ohara wrote:
> 
> I can't find out the reason why MGS and OSS need to learn of the failover partner.

Because the MGS is the centre (i.e. configuration broker if you will) of
how a cluster is configured.  All nodes go to the MGS to get the cluster
configuration.

Strictly speaking, a given OST does not need to know who its failover
partner is.  The --failnode option to mkfs.lustre on the OST is not there
to inform the OST itself; rather, that information is sent (i.e. by
mkfs.lustre) to the MGS so that the rest of the cluster (who do need to
know) can learn it.
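
To make the flow of information concrete, here is a toy model in Python.
It is only a sketch of the idea, not Lustre code: ToyMGS, TargetRecord,
connect() and the node names oss1/oss2 are all made up for illustration.
The target's failover partner is recorded at the broker, and a client
asks the broker which servers to try.

```python
# Toy model only: this is NOT Lustre code. Every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class TargetRecord:
    primary: str                                   # node that normally serves the target
    failnodes: list = field(default_factory=list)  # failover partner(s)


class ToyMGS:
    """Stand-in for the MGS acting as the cluster's configuration broker."""

    def __init__(self):
        self._targets = {}

    def register(self, target, primary, failnodes=()):
        # Models the effect described above: the failover information
        # ends up at the broker, not in the target's own logic.
        self._targets[target] = TargetRecord(primary, list(failnodes))

    def servers_for(self, target):
        # Primary first, then the failover partners in registration order.
        rec = self._targets[target]
        return [rec.primary, *rec.failnodes]


def connect(mgs, target, reachable):
    """Client side: return the first advertised server that is reachable."""
    for node in mgs.servers_for(target):
        if node in reachable:
            return node
    raise ConnectionError(f"no server reachable for {target}")


mgs = ToyMGS()
mgs.register("OST0000", primary="oss1", failnodes=["oss2"])
print(connect(mgs, "OST0000", reachable={"oss2"}))  # oss1 is down here
```

Note that the target itself never consults the failnode list in this
model; only the clients do, via the broker.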

> By that information, does MGS or OSS request the partner not to access the shared volume
> or something special requests?

No.  You must keep in mind that exclusive access to the shared media is
absolutely required.  I think you understand this, but it does bear
repeating.

You must never have both OSSes mount the same volume at the same time --
you will corrupt it.  However, Lustre itself does not take care of this
mounting and unmounting.  That is left to the operating environment that
Lustre is running in (Linux), where it can be achieved using Heartbeat.
Please see the manual for more information on how Heartbeat achieves
this.
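
The rule itself (one mounter at a time) can be written down as a small
Python sketch.  This is not how Heartbeat is implemented and it is not
Lustre code; SharedVolume, take_over() and the node names are invented
here purely to show the invariant that the failover software has to
enforce on Lustre's behalf.

```python
# Toy model only: illustrates the "exclusive access" rule, nothing more.
# SharedVolume and take_over are hypothetical names.
class SharedVolume:
    def __init__(self, name):
        self.name = name
        self.mounted_by = None  # at most one node may hold the volume

    def mount(self, node):
        # Refuse a second mounter: two nodes on one volume means corruption.
        if self.mounted_by is not None and self.mounted_by != node:
            raise RuntimeError(
                f"{self.name} is already mounted by {self.mounted_by}"
            )
        self.mounted_by = node

    def unmount(self, node):
        # Only the current holder can release the volume.
        if self.mounted_by == node:
            self.mounted_by = None


def take_over(volume, old_node, new_node):
    """Move the volume to the partner: release first, then mount."""
    volume.unmount(old_node)
    volume.mount(new_node)  # raises if someone else still holds the volume


vol = SharedVolume("ost0")
vol.mount("oss1")
take_over(vol, "oss1", "oss2")
print(vol.mounted_by)
```

The ordering in take_over() is the whole point: the partner mounts only
after the original node has let go.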

b.




