[lustre-discuss] LNET router (2.10.0) recommendations for heterogeneous (mlx5, qib) IB setup

Nathan R.M. Crawford nrcrawfo at uci.edu
Tue Jul 25 11:52:43 PDT 2017


Hi All,

  We are gradually updating a cluster (OS, etc.) in-place, basically
switching blocks of nodes from the old head node to the new. Until we can
re-arrange the fabric at the next scheduled machine room power shutdown
event, we are running two independent Infiniband subnets. As I can't find
useful documentation on proper IB routing between subnets, I have
configured one node with an HCA on each IB subnet that does simple IPoIB
routing and LNET routing.
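The gist of that router setup, as a sketch (commands reconstructed from
memory, so treat the exact syntax as approximate; addresses match the
subnets described below, and the lnetctl route syntax is what 2.10 ships):

```shell
# On the router node: forward IPoIB traffic between ib0 and ib1
sysctl -w net.ipv4.ip_forward=1

# On a client on the old (QDR, o2ib0) fabric: reach o2ib1 via the router
lnetctl lnet configure
lnetctl route add --net o2ib1 --gateway 10.2.1.22@o2ib0

# On a client on the new (EDR, o2ib1) fabric: the mirror image
lnetctl route add --net o2ib0 --gateway 10.201.32.11@o2ib1
```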

Brief description:
  The router node has 24 cores and 128GB RAM, and runs the in-kernel IB
drivers from CentOS 7.3. It connects to the new IB fabric via a Mellanox
EDR card (MT4115) on ib0, and to the old fabric via a TrueScale QDR card
(QLE7340) on ib1. The old IB subnet is 10.2.0.0/16 (o2ib0), and the new is
10.201.32.0/19 (o2ib1).

  The new 2.10.0 server is on the EDR side, and the old 2.8.0 server is on
the QDR side. Nodes with QDR HCAs already coexist with EDR nodes on the EDR
subnet without problems.

All Lustre configuration is in /etc/lnet.conf:
#####
net:
    - net type: o2ib1
      local NI(s):
        - nid: 10.201.32.11 at o2ib1
          interfaces:
              0: ib0
          tunables:
              peer_timeout: 180
              peer_credits: 62
              peer_buffer_credits: 512
              credits: 1024
          lnd tunables:
              peercredits_hiw: 64
              map_on_demand: 256
              concurrent_sends: 62
              fmr_pool_size: 2048
              fmr_flush_trigger: 512
              fmr_cache: 1
              ntx: 2048
    - net type: o2ib0
      local NI(s):
        - nid: 10.2.1.22 at o2ib0
          interfaces:
              0: ib1
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 512
              credits: 1024
          lnd tunables:
              map_on_demand: 32
              concurrent_sends: 16
              fmr_pool_size: 2048
              fmr_flush_trigger: 512
              fmr_cache: 1
              ntx: 2048
routing:
    - small: 16384
      large: 2048
      enable: 1
#####
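For completeness, that file gets applied at boot roughly the way the stock
lustre systemd unit does it (a sketch; subcommand names assume 2.10's
lnetctl):

```shell
# Load LNET and import the YAML configuration above
modprobe lnet
lnetctl lnet configure
lnetctl import /etc/lnet.conf

# Sanity checks: both NIs present, routing enabled
lnetctl net show
lnetctl routing show
```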

  While the setup works, I had to drop peer_credits to 8 on the QDR side
to avoid long periods of stalled traffic. That will probably be adequate
for the remaining month before the routers are shut down and removed, but
I would still like to have a better solution in hand.
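When the stalls hit, the QDR-side peers appear to be sitting with
exhausted send credits. For anyone wanting to look at the same thing, this
is roughly how I've been watching it (a sketch; the proc path and param
names may differ on other versions):

```shell
# Per-peer credit state on the router; a persistently negative "min"
# tx-credit column suggests credit starvation on that peer
cat /proc/sys/lnet/peers

# Aggregate LNET counters (sends, receives, drops)
lnetctl stats show
```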

Questions:
1) Is there a well-known good config for a qib<-->mlx5 LNET router?
2) Where should I look to identify the cause of the stalled traffic, which
still appears at higher load?
3) What parameters should I be playing with to optimize the router?

Thanks,
Nate



-- 

Dr. Nathan Crawford              nathan.crawford at uci.edu
Modeling Facility Director
Department of Chemistry
1102 Natural Sciences II         Office: 2101 Natural Sciences II
University of California, Irvine  Phone: 949-824-4508
Irvine, CA 92697-2025, USA