[Lustre-discuss] Lustre Thumper Fault Tolerance

Andreas Dilger adilger at sun.com
Mon Mar 10 19:56:33 PDT 2008


On Mar 10, 2008  15:48 -0700, Klaus Steden wrote:
> Is the mmp feature already in the existing Lustre distribution? If so, what
> versions are mmp-aware? If not, which version will be the first to
> incorporate it?

It's in Lustre 1.6.2+ and 1.4.12, and e2fsprogs-1.40.2+.

> On 3/10/08 2:15 PM, "Andreas Dilger" <adilger at sun.com> did etch on stone
> tablets:
> 
> > On Mar 10, 2008  09:09 -0600, Colin Faber wrote:
> >> Is this true even in the case of mounting the OSS as a read only node?
> > 
> > Yes, definitely even a "read only" mount can cause serious corruption.
> > There are several issues involved; the most dangerous is that even for
> > a read-only mount the journal is replayed by the kernel, as otherwise
> > the filesystem may appear to be corrupted.
> > 
> > In addition, there is the problem that (meta)data that is cached on the
> > read-only mounting node will become incorrect as the writing node is
> > changing the filesystem.  The ext3 filesystem is not cluster aware.
> > 
> > In order to prevent situations like this, the newer releases of ldiskfs
> > and e2fsprogs have an "mmp" (multi-mount protection) feature which will
> > prevent the filesystem from being mounted on another node while it is
> > active on one node (either mounted, or running e2fsck).
> > 
> > This will be enabled by default on newly-formatted filesystems which
> > are created with the "--failover" flag, and can also be enabled by
> > hand with "tune2fs -O mmp /dev/XXXX" (replace with MDT or OST device
> > names as appropriate).  This will prevent the filesystem from being
> > mounted or e2fsck'd by old kernels/e2fsprogs so it isn't enabled by
> > default on existing filesystems.
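The tune2fs step above can be tried safely on a file-backed image before touching a real MDT or OST. A minimal sketch, assuming a reasonably recent e2fsprogs (1.40.2+ for mmp support); the image path is just an example:

```shell
# Sketch: enable multi-mount protection on a small file-backed ext3 image
# (stand-in for a real MDT/OST device) and confirm the feature flag took.
dd if=/dev/zero of=/tmp/ost-demo.img bs=1M count=16 2>/dev/null
mke2fs -q -t ext3 -F /tmp/ost-demo.img

# Same command as for a real device, just pointed at the image file:
tune2fs -O mmp /tmp/ost-demo.img

# The superblock feature list should now include "mmp":
dumpe2fs -h /tmp/ost-demo.img | grep -i mmp
```

Once the flag is set, a second node (or an old kernel/e2fsprogs without mmp support) will refuse to mount or fsck the device while it is active elsewhere.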
> > 
> >> Andreas Dilger wrote:
> >>> On Mar 07, 2008  00:04 +0530, Neeladri Bose wrote:
> >>>   
> >>>> To address the performance hit (whatever the percentage), what if we
> >>>> set up DRBD in active-passive mode across the 4500s, but have Lustre
> >>>> point to separate RAID sets across the DRBD pair of 4500s, thus
> >>>> becoming an active-active solution that may actually increase Lustre
> >>>> throughput?
> >>>> 
> >>>> Could this be a possible scenario using DRBD on Linux with ext3 and Lustre?
> >>>>     
> >>> 
> >>> No, Lustre does not support active-active export of backing filesystems.
> >>> This doesn't work because the backing filesystems (ext3/ZFS) are not
> >>> themselves cluster-aware and mounting them on two nodes will quickly
> >>> lead to whole filesystem corruption.
> > 
> > Cheers, Andreas
> > --
> > Andreas Dilger
> > Sr. Staff Engineer, Lustre Group
> > Sun Microsystems of Canada, Inc.
> > 
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
