[Lustre-discuss] Network aliasing and HA

Timh Bergström timh.bergstrom at diino.net
Thu Sep 25 21:59:18 PDT 2008


Hi Klaus,

Thanks. The Linux-HA setup is fairly complete, I think; the problem is the
Lustre timeout: the clients do not try the "other" server fast enough.
Those Lustre timeouts are what I want to change, along with some
recovery options.
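
On the clients trying the "other" server: as far as I understand
mount.lustre, NIDs that belong to the same node are separated by commas and
failover nodes by colons, so with two fixed addresses per server (as Kevin
describes below) the client mount spec would list four NIDs. A minimal
sketch that builds such a spec; the .51/.52 addresses and the mgs_spec
helper are made up for illustration and are not taken from the actual setup:

```python
def mgs_spec(failover_nodes, fsname):
    """Build a client mount spec from per-node NID lists.

    failover_nodes: list of nodes, each a list of NIDs such as
    "10.4.21.51@tcp0". NIDs of one node are joined with ",",
    failover nodes are joined with ":".
    """
    if not failover_nodes or any(not nids for nids in failover_nodes):
        raise ValueError("each failover node needs at least one NID")
    nodes = [",".join(nids) for nids in failover_nodes]
    return ":".join(nodes) + ":/" + fsname


# Hypothetical fixed (non-heartbeat) addresses of the two MGS/MDT servers.
mds1 = ["10.4.21.51@tcp0", "10.4.22.51@tcp1"]
mds2 = ["10.4.21.52@tcp0", "10.4.22.52@tcp1"]
print(mgs_spec([mds1, mds2], "fsname"))
```

The printed string is what would go in front of the mount point in a
"mount -t lustre" command on the clients.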

OT: Where can I read more about "recovery" in Lustre? I've heard
words like replay/recovery in some discussions here, and I'm not sure I
know 100% what these really mean (from a Lustre point of view; the
words themselves are crystal clear ;-)). It seems my manual is too old.

Regards,
Timh

2008/9/25 Klaus Steden <klaus.steden at thomson.net>:
>
> Hi Timh,
>
> If you're using Linux-HA, you can configure how quickly failover takes
> place. I have mine set to 90 seconds before the primary is marked dead and
> the secondary takes over.
>
> When this occurs, any Lustre transactions not yet in flight will block until
> the ones that were in progress at the time of the failure have either had a
> chance to complete or have timed out.
>
> I'm not sure how to modify Lustre-specific settings for recovery time,
> though.
>
> cheers,
> Klaus
>
>
> On 9/25/08 1:54 PM, "Timh Bergström" <timh.bergstrom at diino.net> did etch on
> stone tablets:
>
>> To follow up on this matter: I've now set up HA/DRBD as suggested,
>> formatted the OSTs with two mgsnode directives, and mounted
>> with both addresses on the clients, as ip1@tcp0:ip2@tcp1:/fsname.
>> However, if I fail MGS/MDT 1 it does not recover (in a reasonable time).
>> What kinds of tuning/settings will affect this?
>>
>> //Timh
>>
>> 2008/9/23 Timh Bergström <timh.bergstrom at diino.net>:
>>> Thank you, that's the path I've taken since the last message on this
>>> list, as I misunderstood some of the DRBD/HA setups before. However,
>>> is using 4 mgsnode "paths" recommended, or should I use one
>>> mgsnode path per node and keep the other as some sort of manual failover?
>>>
>>> Regards,
>>> Timh
>>>
>>> 2008/9/23 Kevin Van Maren <Kevin.Vanmaren at sun.com>:
>>>> Note that you do not normally use IP takeover with Lustre/Heartbeat: you set
>>>> the failover IP addresses with the mkfs.lustre command, and Lustre
>>>> reconnects to the _other_ address when it is disconnected.
>>>>
>>>> In your case, you would have 2 fixed addresses for each node (w/o heartbeat
>>>> - do NOT use the heartbeat virtual IP addresses), and specify both those
>>>> failover NIDs (rather than just 1).
>>>>
>>>> Lustre 1.6 is a bit different from a lot of HA/Heartbeat users: Lustre
>>>> _knows_ about the multiple paths/addresses, and simply requires Heartbeat to
>>>> ensure it is mounted on exactly one node in the failover pair: it does NOT
>>>> rely on IP takeover for HA.
>>>>
>>>> Kevin Van Maren
>>>>
>>>>
>>>> Timh Bergström wrote:
>>>>>
>>>>> 2008/9/23 Brian J. Murrell <Brian.Murrell at sun.com>:
>>>>>
>>>>>>
>>>>>> On Tue, 2008-09-23 at 15:06 +0200, Timh Bergström wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>
>>>>> Hi again, and thanks for the quick reply!
>>>>>
>>>>>
>>>>>>>
>>>>>>> My (current) modprobe:
>>>>>>>
>>>>>>> options lnet networks=tcp0(eth0)10.4.21.50,tcp1(eth1)10.4.22.50
>>>>>>>
>>>>>>
>>>>>> This syntax is incorrect.  For some examples of multi-homed
>>>>>> configurations see the manual at
>>>>>>
>>>>>> http://manual.lustre.org/manual/LustreManual16_HTML/MoreComplicatedConfigurations.html#50642998_20213
>>>>>>
>>>>>
>>>>> Yes, that's the link I've been consulting; perhaps I'm not looking hard
>>>>> enough.
>>>>>
>>>>>
>>>>>>>
>>>>>>> These are the errors I get:
>>>>>>> LustreError: 10f-e: Error parsing
>>>>>>> 'networks="tcp0(eth0)10.4.21.50,tcp1(eth1)10.4.22.50"'
>>>>>>>
>>>>>>
>>>>>> When you specify "networks" because you specify the interfaces to use,
>>>>>> you don't need to specify the ip address.  I think you are confusing the
>>>>>> networks and ipnets options.
>>>>>>
>>>>>
>>>>> The exact problem here is that the physical interfaces are there, but
>>>>> not with the IP addresses I want the MDT to "listen" on (the "NIDs").
>>>>> Those are added later through heartbeat as aliases (IPaddr2::10.4.21.50
>>>>> IPaddr2::10.4.22.50), but before mounting the MDT resource (DRBD).
>>>>>
>>>>>
>>>>>>>
>>>>>>> LustreError: 110-0: here...............................|---------|
>>>>>>> LustreError: 4527:0:(events.c:707:ptlrpc_init_portals()) network
>>>>>>> initialisation failed
>>>>>>> (along with a bunch of errors, since this module does not load)
>>>>>>> I've tried tcp0(eth0:0), which fails with about the same error, and
>>>>>>> I've tried tcp0(eth0,eth1), which works but gives me the wrong
>>>>>>> addresses (the machine's own ones).
>>>>>>>
>>>>>>
>>>>>> What is the topology exactly?  Are there two nics or one nic with two
>>>>>> addresses?  Are the two nics on the same physical network or separate
>>>>>> physical networks?
>>>>>>
>>>>>
>>>>> eth0 and eth1 are physical interfaces with statically assigned
>>>>> IPs (for management, supervision etc.); heartbeat then adds addresses
>>>>> to these two interfaces if the node is "primary".
>>>>>
>>>>> If it matters: eth0 and eth1 have separate physical paths to
>>>>> everything, because we want to survive a physical failure on the
>>>>> network before failing over to another physical server.
>>>>>
>>>>> As I read the manual, I format my OSTs with more than one --mgsnode
>>>>> option, which in turn makes the OST "know" about both paths to
>>>>> the MDS/MGS server(s). That is, if the first MGS does not work (physical
>>>>> network failure on side A), try the second (physical side B).
>>>>>
>>>>> What we health-check is the data/disks/server hardware, which will
>>>>> tell heartbeat to fail over to server 2. Server 2 then takes over network
>>>>> path A and network path B (on 10.4.[21,22].50), and the OSTs/clients
>>>>> should continue working without noticing.
>>>>>
>>>>>
>>>>>>
>>>>>> b.



-- 
Timh Bergström
System Administrator
Diino AB - www.diino.com
:wq


