[Lustre-discuss] Lustre1.6.6 manual failover problem

Thu Dec 18 10:29:21 PST 2008

On Thu, 2008-12-18 at 14:45 +0800, Lin Wang wrote:
> 
> [root@mds1 ~]# mkfs.lustre --fsname=testfs --mdt --mgs
> --failnode=mds2 /dev/sdb
> [root@mds1 ~]# mkdir -p /mnt/mdt
> [root@mds1 ~]# mount -t lustre /dev/sdb /mnt/mdt
> [root@mds2 ~]# mkdir -p /mnt/mdt
> [root@mds2 ~]# mount -t lustre /dev/sdb /mnt/mdt

You cannot do this.  By mounting the MDT on *both* MDSes at the same
time, you are corrupting it!  Only one MDS can have the MDT mounted at a
time.  I think you need to review the Failover chapter of the manual
again.

I'm surprised in fact that MMP allowed you to do this.  It should not
have.

> The two OSSes have shared storage:
> [root@oss1 ~]# mkfs.lustre --fsname=testfs --ost --failnode=oss2
> --mgsnode=mds1 --mgsnode=mds2 /dev/sdb
> [root@oss1 ~]# mkdir -p /mnt/ost
> [root@oss1 ~]# mount -t lustre /dev/sdb /mnt/ost
> [root@oss2 ~]# mkdir -p /mnt/ost
> [root@oss2 ~]# mount -t lustre /dev/sdb /mnt/ost

Ditto here.  Two OSSes cannot mount the same OST at the same time.
Corruption!

> Now, when I shut down mds1, the client can no longer use the "df"
> command to list disk usage. Shutting down oss1 or oss3 causes the
> same problem.
> My question: how do I do a manual failover under Lustre 1.6.6?
> I read the Lustre 1.6 manual. It says failover needs the heartbeat
> software and so on, but shouldn't manual failover also be possible?

You need to do manually what heartbeat would do: first power off the
dead node, and only then mount the resource (MDT or OST(s)) on the
surviving node.

b.
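
The manual procedure described above (power off the dead node first, then
mount the target on the survivor) could be sketched roughly as follows. This
is only an illustration, not a tested failover script. The device and mount
point come from the thread; everything about power control is an assumption:
the BMC hostname "mds1-bmc", the IPMI user and password, and the use of
ipmitool at all are hypothetical stand-ins for whatever power switch you
actually have. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Manual failover sketch. Run on the SURVIVING node only (e.g. mds2).
# Assumptions not taken from the thread: the failed node has an IPMI BMC
# reachable as "mds1-bmc" and ipmitool is installed; user/password are
# placeholders. Replace step 1 with your own power control if it differs.

DRY_RUN=${DRY_RUN:-1}            # 1 = print commands instead of running them
DEAD_BMC=${DEAD_BMC:-mds1-bmc}   # hypothetical BMC hostname of the dead node
DEVICE=${DEVICE:-/dev/sdb}       # shared device, as in the thread
MOUNTPOINT=${MOUNTPOINT:-/mnt/mdt}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

manual_failover() {
    # 1. Power off the failed node so it can never write to the shared
    #    device again. Stop here if that cannot be confirmed.
    run ipmitool -I lanplus -H "$DEAD_BMC" -U admin -P secret \
        chassis power off || return 1

    # 2. Only after that, mount the target on this node.
    run mkdir -p "$MOUNTPOINT" || return 1
    run mount -t lustre "$DEVICE" "$MOUNTPOINT"
}

manual_failover
```

For an OST the same steps would apply with MOUNTPOINT=/mnt/ost and the BMC of
the failed OSS. The ordering is the important part: the mount must never run
while the other node could still have the device mounted.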
