[lustre-discuss] lnetctl fails to recreate exact config when importing exported lnet.conf

Angelos Ching angelosching at clustertech.com
Fri Sep 4 03:35:55 PDT 2020


Hi Aurélien,

Could you point me to whom I should send my Jira account request?

Thanks,
Angelos
(Sent from mobile, please pardon me for typos and cursoriness.)

> On 2020/09/04 16:01, Degremont, Aurelien <degremoa at amazon.com> wrote:
> 
> Hi Angelos,
> 
> Bug reports can be filed at https://jira.whamcloud.com/
> 
> 
> Aurélien
> 
> On 04/09/2020 06:11, "lustre-discuss on behalf of Angelos Ching" <lustre-discuss-bounces at lists.lustre.org on behalf of angelosching at clustertech.com> wrote:
> 
>    Dear all,
> 
>    I think I've encountered a bug in lnetctl, but I'm not sure where to
>    submit a bug report:
> 
>    Summary:
>    It's expected that the Lnet config on a node can be recreated at
>    lnet.service startup by saving the config using: lnetctl export
>    --backup > /etc/lnet.conf
>    But the ordering of sections within the YAML file causes extraneous
>    NIDs to be created when used in combination with routing, thus breaking
>    Lnet routing / node communication, with server-side dmesg showing
>    "Bad dest nid n.n.n.n@o2ib (it's my nid but on a different network)"
> 
>    Environment:
>    Client: CentOS 7.8, Lustre 2.12.5-ib, MLNX OFED 4.9-0.1.7.1
>    Lnet router + server: CentOS 7.7, Lustre 2.12.4-ib, MLNX OFED 4.7-3.2.9.0
> 
>    Steps to reproduce:
>    (Listing 1) Server side Lnet config (peer list omitted for conciseness):
>    https://pastebin.com/DH6HAt5a
>    (Listing 2) Full command listing and output on client side is reproduced
>    here: https://pastebin.com/h3wHyCM7
> 
>    All steps below carried out on Lustre client:
> 
>    1. Restart lnet service with empty /etc/lnet.conf
>    2. lnetctl net add: TCP network using Ethernet
>    3. lnetctl peer add: 2 peers with "Lnet router + server"@o2ib,tcp NIDs
>    4. lnetctl route add: 2 gateways to o2ib network using "Lnet router +
>    server"@TCP NID
>    5. lnetctl export: with --backup to /etc/lnet.conf; check the saved file
>    and confirm Lnet is configured with 2 peers and 2 gateways (Listing 2:
>    37-47)
>    6. Mount o2ib exported Lustre volume and confirm volume functioning
>    correctly; unmount volume
>    7. Restart lnet.service and check the lnet configuration; it now shows
>    2 extra peer entries that reference only the TCP NID of the "Lnet
>    router + server", alongside the 2 manually configured peers that
>    reference both o2ib and tcp NIDs (Listing 2: 75-93)
>    8. Client fails to mount the o2ib exported volume; the server-side
>    kernel message shows "Bad dest nid n.n.n.n@o2ib (it's my nid but on a
>    different network)"
> 
>    9. If we reorder /etc/lnet.conf so that the peer list comes before the
>    route list (Listing 2: 16), then lnet is properly configured with only
>    the 2 peers on service restart and everything works as expected.
> 
>    Best regards,
> 
>    --
>    Angelos Ching
>    ClusterTech Limited
> 
>    Tel     : +852-2655-6138
>    Fax     : +852-2994-2101
>    Address : Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, Hong Kong
> 
>    _______________________________________________
>    lustre-discuss mailing list
>    lustre-discuss at lists.lustre.org
>    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 
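
As a rough illustration of the workaround in step 9 of the quoted report (placing the peer list before the route list in /etc/lnet.conf), here is a minimal Python sketch. It assumes the exported file uses unindented top-level keys named "peer:" and "route:". The function names split_top_level and peers_before_routes and the SAMPLE config are hypothetical and are not part of lnetctl or Lustre. It works on plain text rather than through a YAML parser, so every other line is left unchanged.

```python
import re

# A top-level key is an unindented line such as "peer:" or "route:".
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w -]*):\s*$")

# Hypothetical, heavily shortened config for illustration only;
# a real export contains more sections and fields.
SAMPLE = """\
net:
    - net type: tcp
route:
    - net: o2ib
      gateway: 192.0.2.1@tcp
peer:
    - primary nid: 192.0.2.1@tcp
      peer ni:
        - nid: 192.0.2.1@tcp
        - nid: 192.0.2.1@o2ib
global:
    discovery: 1
"""


def split_top_level(text):
    """Split text into [key, lines] blocks, one per top-level key.

    Lines before the first key (if any) go into a block whose key is None.
    """
    blocks = []
    for line in text.splitlines(keepends=True):
        match = TOP_LEVEL_KEY.match(line)
        if match or not blocks:
            blocks.append([match.group(1) if match else None, [line]])
        else:
            blocks[-1][1].append(line)
    return blocks


def peers_before_routes(text):
    """Return text with the 'peer' block moved directly ahead of 'route'.

    The text is returned unchanged if either block is missing or the
    peer block already comes first.
    """
    blocks = split_top_level(text)
    keys = [key for key, _ in blocks]
    if "peer" in keys and "route" in keys:
        peer_at, route_at = keys.index("peer"), keys.index("route")
        if peer_at > route_at:
            blocks.insert(route_at, blocks.pop(peer_at))
    return "".join("".join(lines) for _, lines in blocks)
```

Calling peers_before_routes(SAMPLE) returns the same lines with the peer block ahead of the route block. Whether this ordering avoids the extra peer entries on other Lustre versions is not something the sketch can confirm; it only automates the manual reordering described in the report.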


