[Lustre-discuss] Failover for MGS
Robert LeBlanc
robert at leblancnet.us
Mon Nov 12 13:36:55 PST 2007
You should just unmount all the clients, all OSTs and then:
tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk
If your volume is already on the shared disk, then mount everything and you
should be good to go. You can also do it on a live mounted system using
lctl, but I'm not exactly sure how to do that.
Robert
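
A minimal sketch of that sequence, assuming the shared MGS/MDT device is
/dev/shared/disk and the failover partner's NID is 10.0.0.2@tcp as in the
command above; the mount points here are placeholders for your cluster:

```shell
# 1. Unmount in order: all clients, then all OSTs, then the MDT/MGS.
umount /mnt/lustre            # on every client
umount /mnt/ost0              # on every OSS, once per OST
umount /mnt/mdt               # on the MDS

# 2. Add the failover NID and regenerate the configuration logs.
#    --writeconf erases and rewrites the config logs on next mount,
#    so all targets must be down when you run it.
tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk

# 3. Remount in the reverse order: MDT/MGS first, then OSTs, then clients.
mount -t lustre /dev/shared/disk /mnt/mdt
```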
On 11/12/07 2:24 PM, "Wojciech Turek" <wjt27 at cam.ac.uk> wrote:
> Hi,
>
> What should my tunefs.lustre command line look like if I want to configure
> a failnode for my MDS? I have two MDTs, and the MGS is on the same block
> device as one of the MDTs. I also have two servers connected to shared
> metadata storage.
>
> Thanks,
>
> Wojciech
> On 12 Nov 2007, at 20:49, Nathan Rutman wrote:
>
>> Robert LeBlanc wrote:
>>
>>> Ok, I feel really stupid. I've done this before without any problem, but I
>>> can't seem to get it to work and I can't find my notes from the last time I
>>> did it. We have separate MGS and MDTs. I can't seem to get our MGS to
>>> failover correctly after reformatting it.
>>>
>>> mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mgs
>>> --failnode=192.168.1.253@o2ib /dev/mapper/ldiskc-part1
>>>
>>>
>>>
>> The MGS doesn't actually use the --failnode option (although it won't
>> hurt). You actually have to tell the other nodes
>> in the system (servers and clients) about the failover options for the
>> MGS (use the --mgsnode parameter on servers, and mount address for
>> clients). The reason is because the servers must contact the MGS for
>> the configuration information, and they can't ask the MGS where its
>> failover partner is if e.g. the failover partner is the one that's running.
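>>
>> A sketch of what that looks like in practice. The NIDs here are
>> hypothetical (192.168.1.252@o2ib as the primary MGS and
>> 192.168.1.253@o2ib as its failover partner), as are the device path
>> and fsname:
>>
>> ```shell
>> # On a server (e.g. an OSS): give each target both MGS NIDs with
>> # repeated --mgsnode options, so it can reach either node.
>> mkfs.lustre --ost --fsname=lustre \
>>     --mgsnode=192.168.1.252@o2ib --mgsnode=192.168.1.253@o2ib \
>>     /dev/sdb1
>>
>> # On a client: list both MGS NIDs, colon-separated, in the mount
>> # address so the client can fail over between them.
>> mount -t lustre 192.168.1.252@o2ib:192.168.1.253@o2ib:/lustre /mnt/lustre
>> ```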
>>
>>
>>> We are running this on Debian, using the Lustre 1.6.3 debs from svn on Lenny
>>> with 2.6.22.12. I've tried several permutations of the mkfs.lustre command,
>>> specifying both nodes as failover, both nodes as MGS, and pretty much
>>> every other combination of the above. With the above command tunefs.lustre
>>> shows that failnode and mgsnode are the failover node.
>>>
>>> Thanks,
>>> Robert
>>>
>>> Robert LeBlanc
>>> College of Life Sciences Computer Support
>>> Brigham Young University
>>> leblanc at byu.edu
>>> (801)422-1882
>>>
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at clusterfs.com
>>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>>>
>>>
>>>
>>> Mr Wojciech Turek
>>> Assistant System Manager
>>> University of Cambridge
>>> High Performance Computing service
>>> email: wjt27 at cam.ac.uk
>>> tel. +441223763517
>>>
>>>
>>>
>>>
>>>
Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc at byu.edu
(801)422-1882