[lustre-discuss] Lustre 2.10.0 multi rail configuration

Riccardo Veraldi Riccardo.Veraldi at cnaf.infn.it
Mon Aug 28 19:08:06 PDT 2017


I tried to follow this Intel document

https://www.eofs.eu/_media/events/lad16/12_multirail_lnet_for_lustre_weber.pdf

Anyway, lnetctl fails with this error:

add:
       - ip2nets:
                 errno: -22
                 descr: "cannot add network: Invalid argument"

and this is my lnet.conf:

ip2nets:
  - net-spec: o2ib5
     interfaces:
         0: ib0[0]
         1: ib1[1]
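
Side note: "interfaces:" above sits one column deeper than "net-spec:",
and sibling keys in a YAML mapping must start in the same column, so the
parser alone may explain the EINVAL. A minimal aligned sketch, keeping
the CPT brackets as written:

ip2nets:
  - net-spec: o2ib5
    interfaces:
        0: ib0[0]
        1: ib1[1]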


and here is my lustre.conf:

options lnet networks=o2ib5,tcp5(enp1s0f0)
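
Also worth ruling out (an assumption on my part): if LNet already brought
o2ib5 up from the networks= line above, importing an ip2nets block for the
same net can be rejected. A minimal sketch of configuring both interfaces
directly through DLC instead of the file, then capturing the result:

lnetctl lnet configure
lnetctl net add --net o2ib5 --if ib0,ib1
lnetctl export > /etc/lnet.conf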

thanks


Rick



On 8/28/17 4:07 PM, Chris Horn wrote:
>
> Dynamic LNet configuration (DLC) must be used to configure multi-rail.
> Lustre 2.10 contains an “lnet.conf” file that has a sample multi-rail
> configuration. I’ve copied it below for your convenience.
>
>  
>
> > # lnet.conf - configuration file for lnet routes to be imported by lnetctl
> > #
> > # This configuration file is formatted as YAML and can be imported
> > # by lnetctl.
> > #
> > # net:
> > #     - net type: o2ib1
> > #       local NI(s):
> > #         - nid: 172.16.1.4@o2ib1
> > #           interfaces:
> > #               0: ib0
> > #           tunables:
> > #               peer_timeout: 180
> > #               peer_credits: 128
> > #               peer_buffer_credits: 0
> > #               credits: 1024
> > #           lnd tunables:
> > #               peercredits_hiw: 64
> > #               map_on_demand: 32
> > #               concurrent_sends: 256
> > #               fmr_pool_size: 2048
> > #               fmr_flush_trigger: 512
> > #               fmr_cache: 1
> > #           CPT: "[0,1]"
> > #         - nid: 172.16.2.4@o2ib1
> > #           interfaces:
> > #               0: ib1
> > #           tunables:
> > #               peer_timeout: 180
> > #               peer_credits: 128
> > #               peer_buffer_credits: 0
> > #               credits: 1024
> > #           lnd tunables:
> > #               peercredits_hiw: 64
> > #               map_on_demand: 32
> > #               concurrent_sends: 256
> > #               fmr_pool_size: 2048
> > #               fmr_flush_trigger: 512
> > #               fmr_cache: 1
> > #           CPT: "[0,1]"
> > # route:
> > #     - net: o2ib
> > #       gateway: 172.16.1.1@o2ib1
> > #       hop: -1
> > #       priority: 0
> > # peer:
> > #     - primary nid: 192.168.1.2@o2ib
> > #       Multi-Rail: True
> > #       peer ni:
> > #         - nid: 192.168.1.2@o2ib
> > #         - nid: 192.168.2.2@o2ib
> > #     - primary nid: 172.16.1.1@o2ib1
> > #       Multi-Rail: True
> > #       peer ni:
> > #         - nid: 172.16.1.1@o2ib1
> > #         - nid: 172.16.2.1@o2ib1
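>
> Assuming you save it as /etc/lnet.conf, a minimal way to load and
> verify it is:
>
> lnetctl import < /etc/lnet.conf
> lnetctl net show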
>
>  
>
> Chris Horn
>
>  
>
> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on
> behalf of Riccardo Veraldi <Riccardo.Veraldi at cnaf.infn.it>
> Date: Monday, August 28, 2017 at 5:49 PM
> To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> Subject: [lustre-discuss] Lustre 2.10.0 multi rail configuration
>
>  
>
> Hello,
> I am trying to deploy a multi-rail configuration on Lustre 2.10.0 on
> RHEL 7.3.
> My goal is to use both IB interfaces on the OSSes and the client.
> I have one client, two OSSes and one MDS.
> My LNet networks are labelled o2ib5 and tcp5, just for my own
> convenience. What I did was modify lustre.conf:
>
> options lnet networks=o2ib5(ib0,ib1),tcp5(enp1s0f0)
>
> lctl list_nids on both the OSSes and the client shows me both local
> IB interfaces:
>
> 172.21.52.86@o2ib5
> 172.21.52.118@o2ib5
> 172.21.42.211@tcp5
>
> Anyway, I can't run an LNet selftest using the new NIDs; it fails.
>
> They seem to be unused.
> Any hint on the multi-rail configuration needed?
> What I'd like to do is use both InfiniBand cards (ib0, ib1) on my two
> OSSes and on my client to get more bandwidth, since with only one
> InfiniBand card I cannot saturate the disk performance.
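>
> For reference, a minimal sketch of the kind of selftest I mean, assuming
> lnet_selftest is loaded on both nodes (the group names and the
> client/server pairing are illustrative):
>
> modprobe lnet_selftest
> export LST_SESSION=$$
> lst new_session read_write
> lst add_group clients 172.21.52.86@o2ib5
> lst add_group servers 172.21.52.118@o2ib5
> lst add_batch bulk_rw
> lst add_test --batch bulk_rw --from clients --to servers brw read size=1M
> lst run bulk_rw
> lst stat servers
> lst end_session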
> thank you
>
>  
>