[Lustre-discuss] Network aliasing and HA

Timh Bergström timh.bergstrom at diino.net
Tue Sep 23 09:15:59 PDT 2008


Thank you, that's the path I've taken since the last message on this
list, as I had misunderstood some of the DRBD/HA setups before. However,
is using 4 mgsnode "paths" recommended, or should I use one mgsnode
path per node and keep the other as some sort of manual failover?
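
To make the question concrete, here is a rough sketch of the two
variants as I understand them. The fixed per-node addresses
(10.4.2x.51 and 10.4.2x.52), the fsname and the device are made up for
illustration, and the script only prints the mkfs.lustre command lines
rather than running them:

```shell
#!/bin/sh
# Sketch only. Assumed modprobe line on each server (interfaces, no IPs):
#   options lnet networks=tcp0(eth0),tcp1(eth1)

# Hypothetical fixed (non-heartbeat) addresses of the two MGS/MDS nodes.
NODE1_NIDS="10.4.21.51@tcp0,10.4.22.51@tcp1"
NODE2_NIDS="10.4.21.52@tcp0,10.4.22.52@tcp1"

# Placeholder filesystem name and device.
BASE="mkfs.lustre --fsname=testfs --ost"
DEV="/dev/sdX"

# Variant A: all four paths, i.e. both NIDs of each node.
CMD_ALL="$BASE --mgsnode=$NODE1_NIDS --mgsnode=$NODE2_NIDS $DEV"

# Variant B: one path per node (tcp0 only); tcp1 kept for manual failover.
CMD_ONE="$BASE --mgsnode=10.4.21.51@tcp0 --mgsnode=10.4.21.52@tcp0 $DEV"

echo "$CMD_ALL"
echo "$CMD_ONE"
```

In variant A each --mgsnode option lists both NIDs of one node,
separated by a comma, and the second --mgsnode option names the
failover partner.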

Regards,
Timh

2008/9/23 Kevin Van Maren <Kevin.Vanmaren at sun.com>:
> Note that you do not normally use IP takeover with Lustre/Heartbeat: you set
> the failover IP addresses with the mkfs.lustre command, and Lustre
> reconnects to the _other_ address when it is disconnected.
>
> In your case, you would have 2 fixed addresses for each node (w/o heartbeat
> - do NOT use the heartbeat virtual IP addresses), and specify both those
> failover NIDs (rather than just 1).
>
> Lustre 1.6 differs a bit from many HA/Heartbeat setups: Lustre
> _knows_ about the multiple paths/addresses, and simply requires Heartbeat to
> ensure it is mounted on exactly one node in the failover pair: it does NOT
> rely on IP takeover for HA.
>
> Kevin Van Maren
>
>
> Timh Bergström wrote:
>>
>> 2008/9/23 Brian J. Murrell <Brian.Murrell at sun.com>:
>>
>>>
>>> On Tue, 2008-09-23 at 15:06 +0200, Timh Bergström wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>
>>> Hi,
>>>
>>
>> Hi again, and thanks for the quick reply!
>>
>>
>>>>
>>>> My (current) modprobe:
>>>>
>>>> options lnet networks=tcp0(eth0)10.4.21.50,tcp1(eth1)10.4.22.50
>>>>
>>>
>>> This syntax is incorrect.  For some examples of multi-homed
>>> configurations see the manual at
>>>
>>> http://manual.lustre.org/manual/LustreManual16_HTML/MoreComplicatedConfigurations.html#50642998_20213
>>>
>>
>> Yes, that's the link I've been consulting; perhaps I'm not looking hard
>> enough.
>>
>>
>>>>
>>>> These are the errors I get:
>>>> LustreError: 10f-e: Error parsing
>>>> 'networks="tcp0(eth0)10.4.21.50,tcp1(eth1)10.4.22.50"'
>>>>
>>>
>>> When you specify "networks", you name the interfaces to use, so
>>> you don't need to specify the IP address.  I think you are confusing the
>>> networks and ip2nets options.
>>>
>>
>> The exact problem here is that the physical interfaces are there, but
>> not with the IP addresses I want the MDT to "listen" on (the "NIDs").
>> Those are added later through heartbeat as aliases (IPaddr2::10.4.21.50
>> IPaddr2::10.4.22.50), but before mounting the MDT resource (drbd).
>>
>>
>>>>
>>>> LustreError: 110-0: here...............................|---------|
>>>> LustreError: 4527:0:(events.c:707:ptlrpc_init_portals()) network
>>>> initialisation failed
>>>> (along with a bunch of errors since this module does not load)
>>>> I've tried with tcp0(eth0:0), which fails with about the same error.
>>>> I've tried tcp0(eth0,eth1), which gives me the wrong addresses (the
>>>> machine ones) but works.
>>>>
>>>
>>> What is the topology exactly?  Are there two nics or one nic with two
>>> addresses?  Are the two nics on the same physical network or separate
>>> physical networks?
>>>
>>
>> eth0 and eth1 are physical interfaces. They have statically assigned
>> IPs (for management, supervision etc.); heartbeat then adds addresses
>> to these two interfaces if the node is "primary".
>>
>> If it matters: eth0 and eth1 have separate physical paths to
>> everything, because we want to survive a physical failure on the
>> network before failing over to another physical server.
>>
>> As I read the manual, I format my OSTs with more than one --mgsnode
>> option, which in turn makes the OST "know" about both paths to
>> the MDS/MGS server(s). That is, if the first MGS does not respond
>> (physical network failure on side A), try the second (physical side B).
>>
>> What we health-check is the data/disks/server hardware, which tells
>> heartbeat to fail over to server 2. Server 2 then takes over network
>> path A and network path B (on 10.4.[21,22].50), and the OSTs/clients
>> should continue working without noticing.
>>
>>
>>>
>>> b.
>>>
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>>
>>>
>>
>>
>>
>>
>
>



-- 
Timh Bergström
System Administrator
Diino AB - www.diino.com
:wq


