[Lustre-discuss] Re : OST redundency issues

Mon Jan 28 08:15:13 PST 2008

On Sat, 2008-01-26 at 14:12 +0200, Mailer PH wrote:
> After watching the system logs for a while I've noticed a very strange
> thing: when the OST fails over to the other node,
> it registers itself as a new OST with a new ID, e.g. the original OST ID was
> OST0000, then it mounts as OST0003

This is wrong.  An OST's UUID should not change in response to a
failover event.  It sounds like you don't really have a (drbd) mirror of
the OST on both nodes, or that you are using the wrong drbd device in
your configurations.

> This behaviour is very strange, since when you format the OST on a drbd
> volume it replicates its structure to its pair, so when its pair takes
> over
> the system sees it as the same drive, much like shared storage.

That's how drbd is supposed to work and is why when the failover node
takes over the device, the UUID *should* be the same.  That it's not is
the problem you need to get to the root of.

> Is my configuration even possible? Has anybody run into the same
> issues?

What you are trying to accomplish is indeed possible.  It sounds like
you are just not there yet.

I don't really know much about the particular implementation of drbd, so
you are going to have to do some reading or get help from the drbd users
lists.  I'd suggest you debug drbd first, and confirm absolutely that
when you fail a node, the drbd volume on the other node is indeed the
same as the one on the failed node.
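
One rough way to make that check concrete is to record the filesystem
label of the OST device on the node that currently holds it, fail over,
and read the label again on the node that took over.  The sketch below
is untested against a real drbd setup and rests on several assumptions:
the OST is an ldiskfs (ext3-style) target whose label carries the target
name, e2label is installed, the label string "lustre-OST0000" and the
device path /dev/drbd0 are hypothetical placeholders, and the helper
names read_label and labels_match are made up for illustration.

```python
import subprocess


def read_label(device, run=subprocess.run):
    """Return the filesystem label of `device` as reported by e2label.

    `device` is whatever drbd device backs the OST (hypothetical example:
    /dev/drbd0). `run` can be swapped out for testing.
    """
    completed = run(["e2label", device],
                    capture_output=True, text=True, check=True)
    return completed.stdout.strip()


def labels_match(label_before, label_after):
    """True only if both labels are non-empty and identical."""
    return bool(label_before) and label_before == label_after


# Intended use (run as root on the node that currently holds the device):
#   before = read_label("/dev/drbd0")   # on the original node
#   ... fail over ...
#   after = read_label("/dev/drbd0")    # on the node that took over
#   print(labels_match(before, after))
```

If the two labels differ, or the second node shows a label that was
never written on the first, that would point at the two nodes not
actually looking at the same replicated device, rather than at Lustre.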

b.
