[Lustre-discuss] OST redundancy between nodes?

Fri Jun 26 11:15:14 PDT 2009

On Fri, Jun 26, 2009 at 12:51 PM, Kevin Van Maren<Kevin.Vanmaren at sun.com> wrote:
> OSS is the server.  It normally provides one or more OSTs.
>
> OST failover is done by configuring multiple OSS nodes to be able to serve
> the same OST.  Only ONE OSS node may provide the OST at a time.
>
I understand that an OST can't be served by two or more active OSSs at
a time, but we can/should configure the OSSs for failover. In my
interpretation, an OST failure is a disk/storage failure, so the
failover you are referring to is an OSS failover in my understanding
(i.e., switching to another failover OSS node if a particular OSS fails).

> Failover is accomplished by the clients attempting to connect to each OSS
> node configured to serve the OST, until one of them responds with it active.
>
>
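To make the retry behaviour described above concrete, here is a minimal
sketch in Python. It is only an illustration of the idea, not Lustre
client code: the function name, the `serves_ost` probe, the NID strings,
and the `max_rounds` limit are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def find_active_oss(
    oss_nids: Sequence[str],
    serves_ost: Callable[[str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first OSS that reports the OST as active, else None.

    oss_nids   -- hypothetical list of OSS nodes configured to serve one OST
    serves_ost -- hypothetical probe: True if that node has the OST mounted;
                  may raise ConnectionError if the node is unreachable
    max_rounds -- assumption for the sketch; a real client keeps retrying
    """
    for _ in range(max_rounds):
        for nid in oss_nids:
            try:
                if serves_ost(nid):
                    return nid
            except ConnectionError:
                # Node is down or unreachable: move on to the next candidate.
                continue
    return None


if __name__ == "__main__":
    def probe(nid: str) -> bool:
        # Pretend the primary OSS is dead and the partner has mounted the OST.
        if nid == "oss1@tcp0":
            raise ConnectionError("oss1 is down")
        return nid == "oss2@tcp0"

    print(find_active_oss(["oss1@tcp0", "oss2@tcp0"], probe))
```

The point of the sketch is that the client only needs the list of
candidate nodes; which node currently has the OST mounted is discovered
by asking each one in turn.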
> An OST can be moved back-and-forth between OSS nodes by umount/mount
> commands (assuming both servers can access the same disk!)
>
> If an OST "fails", meaning that the underlying HW has failed (or the
> connection to the storage has failed -- one reason to use multipath IO),
> then Lustre will return IO errors to the application (although there is an
> RFE to not do that).  Normally what happens is the OSS _node_ fails, and the
> other node mounts the OST (typically done by using Linux-HA/Heartbeat).
>
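The two failure cases described above (storage failure versus OSS node
failure) can be summarised in a small toy model. This is a sketch under
stated assumptions, not how Lustre is implemented: the function, its
parameters, and the returned strings are all made up for illustration.

```python
from typing import Dict, List


def write_outcome(storage_ok: bool, oss_up: Dict[str, bool], oss_order: List[str]) -> str:
    """Toy model of what a client write sees for one OST.

    storage_ok -- False if the underlying disk (or the path to it) has failed
    oss_up     -- hypothetical map of OSS node name -> whether the node is alive
    oss_order  -- hypothetical list of OSS nodes configured to serve this OST
    """
    if not storage_ok:
        # The OST itself is gone: no OSS can serve it, so the application
        # gets an I/O error regardless of how many OSS nodes are alive.
        return "EIO"
    for node in oss_order:
        if oss_up.get(node, False):
            # Some surviving node can mount the OST and serve the request.
            return f"served by {node}"
    # Storage is fine but no configured OSS is up: the client keeps waiting.
    return "client waits"


if __name__ == "__main__":
    nodes = ["oss1", "oss2"]
    print(write_outcome(True, {"oss1": False, "oss2": True}, nodes))
    print(write_outcome(False, {"oss1": True, "oss2": True}, nodes))
    print(write_outcome(True, {"oss1": False, "oss2": False}, nodes))
```

In other words, OSS failover protects against a node failure only;
protecting against a storage failure has to happen below Lustre (for
example with redundant storage and multipath IO).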

Yeah, this is what I am curious about: OST/disk/storage-device failure.
It might be nice to have something on the wiki explaining server and
target as separate entities, even when they live on the same machine. I
have gone through the FAQ entry, but it would be great if we could
elaborate on it further.

>
> MDS/MDT failover/configuration is similar.
>
> Kevin
>
>
>
> Carlos Santana wrote:
>>
>> Sorry, but maybe I am confusing OSS and OST.
>>
>> On Fri, Jun 26, 2009 at 11:24 AM, Brian J. Murrell<Brian.Murrell at sun.com>
>> wrote:
>>
>>>
>>> On Fri, 2009-06-26 at 10:56 -0500, Carlos Santana wrote:
>>>
>>>>
>>>> I was wondering what will happen during an OST failure
>>>>  - if a client is in the middle of a read/write operation
>>>>
>>>
>>> Assuming the OST is configured for failover, the client will retry
>>> anything that didn't get committed to disk before the OST failure.  It
>>> will try with all available failover targets for the OST.
>>>
>>
>> Can an OST (disk) be configured for failover like an OSS (server node)?
>>
>>
>>>>
>>>> - if a client requests a read/write after the OST fails
>>>>
>>>
>>> Same as above.
>>>
>>>
>>>>
>>>> When I made the OSS unavailable, the client waited (got a delayed
>>>> response) until the OSS reconnected.
>>>>
>>>
>>> Right.  That's failover.
>>>
>>>
>>>>
>>>> I am not sure about OST failure though. Any
>>>> clues?
>>>>
>>>
>>> An OST fails if an OSS fails given that an OST is the disk in an OSS
>>> (which is the node).
>>>
>>
>> I thought an OST (disk) can fail without the OSS (server) failing.
>> And that's my question: what will happen in such a scenario, both
>> while a client is in a read/write operation and when a client
>> requests a read/write after the OST (disk) failure?
>>
>>
>>>
>>> b.
>>>
>>>
>
>