[lustre-discuss] Lustre 2.10.0 multi rail configuration

Gmitter, Joseph joseph.gmitter at intel.com
Tue Aug 29 06:41:16 PDT 2017


While this is not a direct solution to your issue, note that multi-rail (MR) documentation can also be found in the Lustre manual at http://doc.lustre.org/lustre_manual.xhtml#lnetmr
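
Roughly, the DLC-based setup described there boils down to something like the following (a minimal sketch only; it assumes ib0 and ib1 are the local HCAs and reuses the o2ib5 network name from this thread):

    modprobe lnet
    lnetctl lnet configure
    lnetctl net add --net o2ib5 --if ib0,ib1   # both interfaces on one LNet network
    lnetctl net show                           # both NIDs should now appear under o2ib5
    lnetctl export > /etc/lnet.conf            # persist the running configuration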


From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Riccardo Veraldi <Riccardo.Veraldi at cnaf.infn.it>
Date: Monday, August 28, 2017 at 10:08 PM
To: Chris Horn <hornc at cray.com>, "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lustre 2.10.0 multi rail configuration

I tried to follow this Intel document

https://www.eofs.eu/_media/events/lad16/12_multirail_lnet_for_lustre_weber.pdf

Anyway, lnetctl fails with the error:

add:
       - ip2nets:
                 errno: -22
                 descr: "cannot add network: Invalid argument"

and this is my lnet.conf

ip2nets:
  - net-spec: o2ib5
     interfaces:
         0: ib0[0]
         1: ib1[1]


and here is my lustre.conf

options lnet networks=o2ib5,tcp5(enp1s0f0)

thanks


Rick



On 8/28/17 4:07 PM, Chris Horn wrote:
Dynamic LNet configuration (DLC) must be used to configure multi-rail. Lustre 2.10 contains an “lnet.conf” file that has a sample multi-rail configuration. I’ve copied it below for your convenience.

> # lnet.conf - configuration file for lnet routes to be imported by lnetctl
> #
> # This configuration file is formatted as YAML and can be imported
> # by lnetctl.
> #
> # net:
> #     - net type: o2ib1
> #       local NI(s):
> #         - nid: 172.16.1.4@o2ib1
> #           interfaces:
> #               0: ib0
> #           tunables:
> #               peer_timeout: 180
> #               peer_credits: 128
> #               peer_buffer_credits: 0
> #               credits: 1024
> #           lnd tunables:
> #               peercredits_hiw: 64
> #               map_on_demand: 32
> #               concurrent_sends: 256
> #               fmr_pool_size: 2048
> #               fmr_flush_trigger: 512
> #               fmr_cache: 1
> #           CPT: "[0,1]"
> #         - nid: 172.16.2.4@o2ib1
> #           interfaces:
> #               0: ib1
> #           tunables:
> #               peer_timeout: 180
> #               peer_credits: 128
> #               peer_buffer_credits: 0
> #               credits: 1024
> #           lnd tunables:
> #               peercredits_hiw: 64
> #               map_on_demand: 32
> #               concurrent_sends: 256
> #               fmr_pool_size: 2048
> #               fmr_flush_trigger: 512
> #               fmr_cache: 1
> #           CPT: "[0,1]"
> # route:
> #     - net: o2ib
> #       gateway: 172.16.1.1@o2ib1
> #       hop: -1
> #       priority: 0
> # peer:
> #     - primary nid: 192.168.1.2@o2ib
> #       Multi-Rail: True
> #       peer ni:
> #         - nid: 192.168.1.2@o2ib
> #         - nid: 192.168.2.2@o2ib
> #     - primary nid: 172.16.1.1@o2ib1
> #       Multi-Rail: True
> #       peer ni:
> #         - nid: 172.16.1.1@o2ib1
> #         - nid: 172.16.2.1@o2ib1
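
A file like this can be loaded and verified with lnetctl; a short sketch (assuming it has been saved as /etc/lnet.conf with the comment markers removed):

    lnetctl import < /etc/lnet.conf   # load the YAML configuration
    lnetctl net show -v               # check local NIs, tunables and CPTs
    lnetctl peer show                 # check that Multi-Rail peers are listed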

Chris Horn

From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Riccardo Veraldi <Riccardo.Veraldi at cnaf.infn.it>
Date: Monday, August 28, 2017 at 5:49 PM
To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] Lustre 2.10.0 multi rail configuration

Hello,
I am trying to deploy a multi-rail configuration on Lustre 2.10.0 on RHEL 7.3.
My goal is to use both IB interfaces on the OSSes and the client.
I have one client, two OSSes, and one MDS.
My LNet networks are labelled o2ib5 and tcp5, just for my own convenience. What I did was to modify lustre.conf

options lnet networks=o2ib5(ib0,ib1),tcp5(enp1s0f0)

lctl list_nids on either the OSSes or the client shows both local IB interfaces:

172.21.52.86@o2ib5
172.21.52.118@o2ib5
172.21.42.211@tcp5

Anyway, I can't run an LNet selftest using the new NIDs; it fails.
It seems they are unused.
Any hints on the multi-rail configuration needed?
What I'd like to do is use both InfiniBand cards (ib0, ib1) on my two OSSes and on my client to get more bandwidth,
since with only one InfiniBand card I cannot saturate the disk performance.
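
For reference, the kind of selftest I mean is along these lines (just a sketch; the placeholder NIDs would be replaced with an actual OSS NID and client NID):

    modprobe lnet_selftest
    export LST_SESSION=$$
    lst new_session mr_check
    lst add_group servers <OSS-NID>@o2ib5      # e.g. one entry per OSS interface
    lst add_group clients <CLIENT-NID>@o2ib5
    lst add_batch bulk
    lst add_test --batch bulk --from clients --to servers brw write size=1M
    lst run bulk
    lst stat clients servers                   # watch throughput; Ctrl-C to stop
    lst end_session
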
thank you



