[Lustre-discuss] how to define 60 failnodes

Brian J. Murrell Brian.Murrell at Sun.COM
Mon Nov 9 07:31:19 PST 2009


On Mon, 2009-11-09 at 16:25 +0800, lelustre wrote: 
>     I have 60 nodes to use as OSSes. I ran an experiment using an
> iSCSI disk as an OST. If I do not define failnodes when I run the
> mkfs.lustre command, I can mount the OST, and lfs df -h on the client
> node shows it. But when I unmount it and mount the OST on another
> OSS, lfs df -h no longer shows it.

Right.  Because you have to tell the client which other nodes might make
that OST available so that it can find the one actually making it
available.  If you don't give the client any alternate nodes, it doesn't
know other nodes and doesn't try any but the one node the OST was
configured on.
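
To illustrate the point above, here is a toy model in Python, not Lustre's
actual client code. The function name locate_ost and the NIDs oss01@tcp and
oss02@tcp are made up for the example. The only assumption it encodes is the
one described above: the client tries only the nodes it was told about.

```python
from typing import Optional, Sequence


def locate_ost(known_nids: Sequence[str], serving_nid: str) -> Optional[str]:
    """Return the NID where the OST was found, or None if no known NID serves it.

    known_nids  -- nodes the client was told about (original OSS plus failnodes)
    serving_nid -- node that currently has the OST mounted
    """
    for nid in known_nids:
        # The client never tries a node that is not in its list.
        if nid == serving_nid:
            return nid
    return None


# Formatted without failnodes: the client knows only the original OSS,
# so an OST moved to oss02 is never found.
print(locate_ost(["oss01@tcp"], "oss02@tcp"))

# Formatted with oss02 as a failnode: the moved OST can be found.
print(locate_ost(["oss01@tcp", "oss02@tcp"], "oss02@tcp"))
```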

> But if I define the failnodes in the mkfs.lustre command and repeat the
> operations above, the client node still shows the OST in lfs df -h.

Right.

>     So my question is: if I want an OST to be able to fail over to any
> OSS (one of sixty nodes), do I need to define 60 failnodes when I format
> the disk?

Theoretically.  I had discussed this briefly with another engineer a
while ago and IIRC, the result of the discussion was that there was
nothing inherent in the configuration logic that would prevent one from
having more than two ("primary" and "failover") OSSes providing service
to an OST.  Two nodes per OST is how just about everyone that wants
failover configures Lustre.

I'm not sure that 60 nodes for every OST is really practical,
though.  When an OSS does fail, the process of finding the OST on a
failover node is serial and linear.  That is, when the OSS providing an
OST dies, the client cycles through the OST's failover list trying each
OSS, serially, until it finds the OST.  The time given to each discovery
attempt is not trivial (i.e. it is more than just a few seconds), so
hunting through 60 of them will take considerable time.
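
As a rough back-of-the-envelope sketch of that serial search, assume every
failed attempt costs the same fixed time. The 10-second figure below is a
placeholder chosen for illustration, not a Lustre default, and the function
name is made up.

```python
def delay_before_found(position: int, seconds_per_attempt: float) -> float:
    """Seconds spent on failed attempts before the OST is reached.

    position            -- 1-based place in the failover list of the node
                           that actually serves the OST
    seconds_per_attempt -- assumed cost of one failed attempt (placeholder)
    """
    if position < 1:
        raise ValueError("position is 1-based and must be >= 1")
    # Every node tried before the right one costs one full attempt.
    return (position - 1) * seconds_per_attempt


ATTEMPT_SECONDS = 10.0  # placeholder value, not taken from Lustre

# Worst case: the serving node is the last entry in the list.
for list_length in (2, 60):
    worst = delay_before_found(list_length, ATTEMPT_SECONDS)
    print(f"{list_length:2d} nodes in list: up to {worst:.0f} s of failed attempts")
```

With that placeholder, a two-node pair wastes at most 10 seconds before
reaching the second node, while a 60-entry list can waste up to 590 seconds.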

> Or can I use Pacemaker to select an OSS and modify something to notify
> the client that the disk is on that OSS?

No.  There is currently no way to push a client towards an OSS for a
given OST.

b.

