[Lustre-discuss] Failover for MGS

Robert LeBlanc robert at leblancnet.us
Mon Nov 12 13:36:55 PST 2007


You should just unmount all the clients, all OSTs and then:

tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk

If your volume is already on the shared disk, then mount everything and you
should be good to go. You can also do it on a live mounted system using
lctl, but I'm not exactly sure how to do that.
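
For your setup below (the MGS sharing a block device with one of the MDTs,
and two servers on the shared metadata storage), something like this should
work -- the NIDs and device paths here are made up, so substitute your own:

  # On the server holding the combined MGS/MDT device:
  tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/mdt_disk

  # Per Nathan's note further down, the OSTs also need to be told both
  # MGS NIDs so they can still find the MGS after a failover:
  tunefs.lustre --mgsnode=10.0.0.1@tcp --mgsnode=10.0.0.2@tcp \
      --writeconf /dev/ost_disk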

Robert

On 11/12/07 2:24 PM, "Wojciech Turek" <wjt27 at cam.ac.uk> wrote:

> Hi,
> 
> What should my tunefs.lustre command line look like if I want to configure a
> failnode for my MDS? I have two MDTs, and the MGS is on the same block device
> as one of the MDTs. I also have two servers connected to shared metadata
> storage.
> 
> Thanks,
> 
> Wojciech 
> On 12 Nov 2007, at 20:49, Nathan Rutman wrote:
> 
>> Robert LeBlanc wrote:
>>  
>>> Ok, I feel really stupid. I've done this before without any problem, but I
>>> can't seem to get it to work and I can't find my notes from the last time I
>>> did it. We have separate MGS and MDTs. I can't seem to get our MGS to
>>> failover correctly after reformatting it.
>>> 
>>> mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mgs
>>> --failnode=192.168.1.253@o2ib /dev/mapper/ldiskc-part1
>>> 
>>> 
>>>  
>> The MGS doesn't actually use the --failnode option (although it won't
>> hurt). You have to tell the other nodes in the system (servers and
>> clients) about the failover options for the MGS: use the --mgsnode
>> parameter on servers, and the mount address for clients. The reason is
>> that the servers must contact the MGS for the configuration information,
>> and they can't ask the MGS where its failover partner is if, say, the
>> failover partner is the one that's currently running.
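>> 
>> For example (all names here are placeholders; 192.168.1.253@o2ib is your
>> failover node), each OST would be formatted with both MGS NIDs:
>> 
>>   mkfs.lustre --ost --fsname=testfs \
>>       --mgsnode=192.168.1.252@o2ib --mgsnode=192.168.1.253@o2ib /dev/sdb
>> 
>> and clients would list both NIDs, colon-separated, in the mount source:
>> 
>>   mount -t lustre 192.168.1.252@o2ib:192.168.1.253@o2ib:/testfs /mnt/testfs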
>> 
>>  
>>> We are running this on Debian, using the Lustre 1.6.3 debs from svn on Lenny
>>> with 2.6.22.12. I've tried several permutations of the mkfs.lustre command,
>>> specifying both nodes as failover, both nodes as MGS, and pretty much every
>>> other combination of the above. With the above command, tunefs.lustre shows
>>> that failnode and mgsnode are the failover node.
>>> 
>>> Thanks,
>>> Robert
>>> 
>>> Robert LeBlanc
>>> College of Life Sciences Computer Support
>>> Brigham Young University
>>> leblanc at byu.edu
>>> (801)422-1882
>>> 
>>> 
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at clusterfs.com
>>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>>>
>>> Mr Wojciech Turek
>>> Assistant System Manager
>>> University of Cambridge
>>> High Performance Computing service 
>>> email: wjt27 at cam.ac.uk
>>> tel. +441223763517

 
Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc at byu.edu
(801)422-1882



