[Lustre-discuss] MGS Nids
leen smit
leen@service2media.com
Fri May 21 02:57:22 PDT 2010
Ok, I started from scratch, using your kind replies as a guideline.
Yet, still no failover when bringing down the first MGS.
Below are the steps I've taken to set things up; hopefully someone here
can spot my error.
I got rid of keepalived and DRBD (was this wise, or should I keep them
for the MGS/MDT syncing?) and set up just Lustre.
Two nodes for MGS/MDT, and two nodes for OSTs.
fs-mgs-001:~# mkfs.lustre --mgs --failnode=fs-mgs-002@tcp --reformat
/dev/VG1/mgs
fs-mgs-001:~# mkfs.lustre --mdt --mgsnode=fs-mgs-001@tcp
--failnode=fs-mgs-002@tcp --fsname=datafs --reformat /dev/VG1/mdt
fs-mgs-001:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
fs-mgs-001:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/
fs-mgs-002:~# mkfs.lustre --mgs --failnode=fs-mgs-001@tcp --reformat
/dev/VG1/mgs
fs-mgs-002:~# mkfs.lustre --mdt --mgsnode=fs-mgs-001@tcp
--failnode=fs-mgs-001@tcp --fsname=datafs --reformat /dev/VG1/mdt
fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/
fs-ost-001:~# mkfs.lustre --ost --mgsnode=fs-mgs-001@tcp
--mgsnode=fs-mgs-002@tcp --failnode=fs-ost-002@tcp --reformat
--fsname=datafs /dev/VG1/ost1
fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
fs-ost-002:~# mkfs.lustre --ost --mgsnode=fs-mgs-001@tcp
--mgsnode=fs-mgs-002@tcp --failnode=fs-ost-001@tcp --reformat
--fsname=datafs /dev/VG1/ost1
fs-ost-002:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
fs-mgs-001:~# lctl dl
0 UP mgs MGS MGS 7
1 UP mgc MGC192.168.21.33@tcp 5b8fb365-ae8e-9742-f374-539d8876276f 5
2 UP mgc MGC127.0.1.1@tcp 380bc932-eaf3-9955-7ff0-af96067a2487 5
3 UP mdt MDS MDS_uuid 3
4 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
5 UP mds datafs-MDT0000 datafs-MDT0000_UUID 5
6 UP osc datafs-OST0000-osc datafs-mdtlov_UUID 5
7 UP osc datafs-OST0001-osc datafs-mdtlov_UUID 5
fs-mgs-001:~# lctl list_nids
192.168.21.32@tcp
client:~# mount -t lustre 192.168.21.32@tcp:192.168.21.33@tcp:/datafs /data
client:~# time cp test.file /data/
real 0m47.793s
user 0m0.001s
sys 0m3.155s
So far, so good.
Let's try that again, now bringing down mgs-001:
client:~# time cp test.file /data/
fs-mgs-001:~# umount /mnt/mdt && umount /mnt/mgs
fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs
fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt
fs-mgs-002:~# lctl dl
0 UP mgs MGS MGS 5
1 UP mgc MGC192.168.21.32@tcp 82b34916-ed89-f5b9-026e-7f8e1370765f 5
2 UP mdt MDS MDS_uuid 3
3 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
4 UP mds datafs-MDT0000 datafs-MDT0000_UUID 3
The OSTs are missing here, so I (try to...) remount those too:
fs-ost-001:~# umount /mnt/ost/
fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
mount.lustre: mount /dev/mapper/VG1-ost1 at /mnt/ost failed: No such
device or address
The target service failed to start (bad config log?)
(/dev/mapper/VG1-ost1). See /var/log/messages.
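For the record, one thing I may try next (an untested sketch on my part,
using the same devices and node names as above): re-registering the OST
with both MGS NIDs and forcing a fresh config log via tunefs.lustre
before remounting:

```shell
# Untested sketch: tunefs.lustre must run on the *unmounted* target.
# --erase-params drops the old server parameters; --writeconf makes
# the target regenerate its configuration log on the next mount.
fs-ost-001:~# tunefs.lustre --erase-params \
    --mgsnode=fs-mgs-001@tcp --mgsnode=fs-mgs-002@tcp \
    --failnode=fs-ost-002@tcp --writeconf /dev/VG1/ost1
fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
```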
After this I can only get back to a running state by unmounting
everything on mgs-002 and remounting on mgs-001.
What am I missing here? Am I messing things up by creating two MGSes,
one on each MGS node?
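For comparison, my reading of the shared-storage layout Gabriele
describes below (sketch only; the /dev/sdb1 device and IPs are from his
example, not my boxes): format the MGS once, on a device both servers
can see, and only ever mount it on one server at a time:

```shell
# Sketch: a single MGS target on a shared LUN, active/passive.
# Format exactly once, from server1 only:
mkfs.lustre --mgs --failnode=192.168.2.22@tcp --reformat /dev/sdb1
# Normal operation, on server1:
mount -t lustre /dev/sdb1 /lustre/mgs_prova
# Failover: after server1 unmounts (or dies), on server2:
mount -t lustre /dev/sdb1 /lustre/mgs_prova
```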
Leen
On 05/20/2010 03:40 PM, Gabriele Paciucci wrote:
> For clarification, in a two-server configuration:
>
> server1 -> 192.168.2.20 MGS+MDT+OST0
> server2 -> 192.168.2.22 OST1
> /dev/sdb is a LUN shared between server1 and server2
>
> from server1: mkfs.lustre --mgs --failnode=192.168.2.22 --reformat /dev/sdb1
> from server1: mkfs.lustre --reformat --mdt --mgsnode=192.168.2.20
> --fsname=prova --failover=192.168.2.22 /dev/sdb4
> from server1: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20
> --failover=192.168.2.22 --fsname=prova /dev/sdb2
> from server2: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20
> --failover=192.168.2.20 --fsname=prova /dev/sdb3
>
>
> from server1: mount -t lustre /dev/sdb1 /lustre/mgs_prova
> from server1: mount -t lustre /dev/sdb4 /lustre/mdt_prova
> from server1: mount -t lustre /dev/sdb2 /lustre/ost0_prova
> from server2: mount -t lustre /dev/sdb3 /lustre/ost1_prova
>
>
> from client:
> modprobe lustre
> mount -t lustre 192.168.2.20@tcp:192.168.2.22@tcp:/prova /prova
>
> now halt server1 and mount MGS, MDT and OST0 on server2; the client
> should continue the activity without problems
>
>
>
> On 05/20/2010 02:55 PM, Kevin Van Maren wrote:
>
>> leen smit wrote:
>>
>>
>>> Ok, no VIPs then... But how does failover work in Lustre then?
>>> If I set up everything using the real IP and then mount from a client and
>>> bring down the active MGS, the client will just sit there until it comes
>>> back up again.
>>> As in, there is no failover to the second node. So how does this
>>> internal Lustre failover mechanism work?
>>>
>>> I've been going through the docs, and I must say there is very little on
>>> the failover mechanism, apart from mentions that a separate app should
>>> take care of that. That's the reason I'm implementing keepalived...
>>>
>>>
>>>
>> Right: the external service needs to keep the "mount" active/healthy on
>> one of the servers.
>> Lustre handles reconnecting clients/servers as long as the volume is
>> mounted where it expects
>> (i.e., the mkfs node or the --failover node).
>>
>>
>>> At this stage I really am clueless, and can only think of creating a TUN
>>> interface, which will have the VIP address (thus, it becomes a real IP,
>>> not just a VIP).
>>> But I got a feeling that ain't the right approach either...
>>> Are there any docs available where an active/passive MGS setup is described?
>>> Is it sufficient to define a --failnode=nid,... at creation time?
>>>
>>>
>>>
>> Yep. See Johann's email on the MGS, but for the MDTs and OSTs that's
>> all you have to do
>> (besides listing both MGS NIDs at mkfs time).
>>
>> For the clients, you specify both MGS NIDs at mount time, so it can
>> mount regardless of which
>> node has the active MGS.
>>
>> Kevin
>>
>>
>>
>>> Any help would be greatly appreciated!
>>>
>>> Leen
>>>
>>>
>>> On 05/20/2010 01:45 PM, Brian J. Murrell wrote:
>>>
>>>
>>>
>>>> On Thu, 2010-05-20 at 12:46 +0200, leen smit wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Keepalive uses a VIP in a active/passive state. In a failover situation
>>>>> the VIP gets transferred to the passive one.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Don't use virtual IPs with Lustre. Lustre clients know how to deal with
>>>> failover nodes that have different IP addresses and using a virtual,
>>>> floating IP address will just confuse it.
>>>>
>>>> b.
>>>>
>>>>
>>>>
>>>>
>>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>
>>>
>>>
>
>