[lustre-discuss] Re: [EXTERNAL] Re: need to always manually add network after reboot

Angelos Ching angelosching at clustertech.com
Wed Feb 24 07:06:40 PST 2021


There's another thing that I had wanted to submit a bug report for but haven't gotten to yet…

lnetctl exports and imports in lexical order (or Python dictionary order, to be exact), but some settings fail if they are not imported in a specific order. For example, a network needs to be set up before a router can be added, but lnetctl import tries to set up the router before the network is up and fails, or something similar.

I never got around to reporting it after working around the problem with a manually written lnet.conf, since our configuration is pretty static…
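
To illustrate the ordering workaround, here is a minimal Python sketch that moves the network sections of an exported file to the top before it is handed to lnetctl import. It is only a sketch under assumptions: the function name reorder_sections and the NETWORK_FIRST tuple are made up for this example, the file is assumed to consist of unindented top-level "key:" lines followed by indented bodies, and the only ordering rule encoded is the one described above (networks before everything else).

```python
import re

# Sections that should be imported first. This tuple is an assumption made
# for illustration; it encodes only "networks before everything else".
NETWORK_FIRST = ("net", "ip2nets")

# An unindented "key:" line starts a new top-level section.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):\s*(#.*)?$")


def reorder_sections(text, first=NETWORK_FIRST):
    """Return the YAML text with the sections named in `first` moved to the top.

    All other sections keep their original relative order.
    """
    preamble = []   # anything before the first top-level key (e.g. comments)
    sections = []   # list of (key, [lines]) pairs
    for line in text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            sections.append((match.group(1), [line]))
        elif sections:
            sections[-1][1].append(line)
        else:
            preamble.append(line)

    def rank(section):
        key = section[0]
        return first.index(key) if key in first else len(first)

    # sorted() is stable, so sections with equal rank stay in file order.
    ordered = sorted(sections, key=rank)
    lines = preamble + [line for _, body in ordered for line in body]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    exported = (
        "global:\n"
        "    discovery: 1\n"
        "net:\n"
        "    - net type: tcp\n"
        "route:\n"
        "    - net: o2ib\n"
    )
    print(reorder_sections(exported), end="")
```

Working on the text directly keeps the sketch free of third-party YAML libraries and leaves every line of the export untouched; only whole sections move.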

Cheers,
Angelos
(Sent from mobile, please pardon me for typos and cursoriness.)

> On 24/2/2021 19:43, Laura Hild via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
> 
> 
> > Or you can manually build lnet.conf as lnetctl seems to have occasional
> > problems with some of the fields exported by "lnetctl export --backup"
> 
> I've noticed, in particular,
> 
>   LNetError: 122666:0:(peer.c:372:lnet_peer_ni_del_locked())
>   Peer NI x.x.x.x@tcp is a gateway. Can not delete it
> 
> and
> 
>   errno: -2
>   descr: "cannot add peer ni: No such file or directory"
> 
> not having removed the peer: section.
> 
> -Laura
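
One reading of the message above is that the errors go away once the peer: section is removed from the exported file. A minimal Python sketch of that edit follows. The function name drop_sections is hypothetical, and the sketch assumes the export consists of unindented top-level keys with indented bodies; it does not validate the YAML in any way.

```python
import re

# An unindented "key:" at the start of a line begins a top-level section.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):")


def drop_sections(text, unwanted=("peer",)):
    """Return the YAML text without the top-level sections named in `unwanted`."""
    kept = []
    skipping = False
    for line in text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            # Entering a new section: decide whether to skip it entirely.
            skipping = match.group(1) in unwanted
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    exported = (
        "net:\n"
        "    - net type: tcp\n"
        "peer:\n"
        "    - example-field: example-value\n"  # placeholder body, not real lnetctl output
        "route:\n"
        "    - net: o2ib\n"
    )
    print(drop_sections(exported), end="")
```
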
> 
> 
> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Angelos Ching via lustre-discuss <lustre-discuss at lists.lustre.org>
> Sent: Tuesday, 23 February 2021 05:06
> To: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
> Subject: [EXTERNAL] Re: [lustre-discuss] need to always manually add network after reboot
>  
> Hi Sid,
> I notice that you are using lnetctl net add to add the LNet network, which means you should be on a recent version of Lustre that relies on /etc/lnet.conf for boot-time LNet configuration.
> You can save the current LNet configuration with the command: lnetctl export --backup > /etc/lnet.conf (make a backup of the original file first if required).
> On the next boot, lnet.service will load your LNet configuration from that file.
> Alternatively, you can build lnet.conf by hand, as lnetctl seems to have occasional problems with some of the fields exported by "lnetctl export --backup".
> Attaching my simple lnet.conf for your reference:
>> # cat /etc/lnet.conf
>> ip2nets:
>>   - net-spec: o2ib
>>     ip-range:
>>       0: 10.2.8.*
>>   - net-spec: tcp
>>     ip-range:
>>       0: 10.5.9.*
>> route:
>>     - net: o2ib
>>       gateway: 10.5.9.25@tcp
>>       hop: -1
>>       priority: 0
>>     - net: o2ib
>>       gateway: 10.5.9.24@tcp
>>       hop: -1
>>       priority: 0
>> global:
>>     numa_range: 0
>>     max_intf: 200
>>     discovery: 1
>>     drop_asym_route: 0
> Best regards,
> Angelos
> On 23/02/2021 13:58, Sid Young via lustre-discuss wrote:
>> 
>> G'Day all,
>> I'm finding that when I reboot any node in our new HPC, I need to keep manually adding the network using lnetctl net add --net tcp --if ens2f0
>> Then I can do an lnetctl net show and see the tcp part active...
>> 
>> I have options in /etc/modprobe.d/lnet.conf:
>> options lnet networks=tcp
>> 
>> and 
>> 
>> [root@hpc-oss-03 ~]# cat /etc/modprobe.d/lustre.conf
>> options lnet networks="tcp(ens2f0)"
>> options lnet ip2nets="tcp(ens2f0) 10.140.93.*
>> 
>> I've read the doco and tried to understand the correct parameters for a simple Lustre config, so this is what I worked out is needed... but I suspect it's still wrong.
>> 
>> Any help appreciated :)
>> 
>> 
>> 
>> Sid Young
>> 
>> 
>> 
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> -- 
> Angelos Ching
> ClusterTech Limited
> 
> Tel     : +852-2655-6138
> Fax     : +852-2994-2101
> Address	: Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, Hong Kong
> 
> Got praises or room for improvements? http://bit.ly/TellAngelos
> 