[lustre-discuss] status: down for network interface

Ulrich Sibiller ulrich.sibiller at eviden.com
Wed Jun 26 07:43:00 PDT 2024


Hello,

On one of our MDS nodes (MDS1), running Lustre 2.15.4, one NI is reported with status "down":

[ mds1 ]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0 at lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: AA.BB.CC.34 at o2ib
          status: down <-----------------
          interfaces:
              0: ib0
        - nid: AA.BB.CC.35 at o2ib
          status: up
          interfaces:
              0: ib1
    - net type: tcp
      local NI(s):
        - nid: DD.EE.FF.42 at tcp
          status: up
          interfaces:
              0: bond0

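To track this over time, the status lines can be pulled out of saved lnetctl net show output with a few lines of Python. This is only a minimal sketch: it assumes the text layout shown above (including the "at" that the list archive substitutes in the NIDs), the function name ni_status is made up for illustration, and it does plain line matching rather than real YAML parsing.

```python
# Sketch: list local NIs and their status from saved `lnetctl net show` text.
# SAMPLE is copied from the output above (NIDs anonymized as in the mail).
SAMPLE = """\
net:
    - net type: lo
      local NI(s):
        - nid: 0 at lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: AA.BB.CC.34 at o2ib
          status: down <-----------------
          interfaces:
              0: ib0
        - nid: AA.BB.CC.35 at o2ib
          status: up
          interfaces:
              0: ib1
    - net type: tcp
      local NI(s):
        - nid: DD.EE.FF.42 at tcp
          status: up
          interfaces:
              0: bond0
"""


def ni_status(text):
    """Return (nid, status, interface) tuples in order of appearance.

    ni_status is a hypothetical helper, not part of any Lustre tooling.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- nid:"):
            current = {"nid": line.split(":", 1)[1].strip(),
                       "status": None, "iface": None}
            entries.append(current)
        elif current is not None and line.startswith("status:"):
            # Keep only the first word so hand-added markers are ignored.
            current["status"] = line.split(":", 1)[1].split()[0]
        elif current is not None and line.startswith("0:"):
            # Only the first interface (index 0) of each NI is recorded.
            current["iface"] = line.split(":", 1)[1].strip()
    return [(e["nid"], e["status"], e["iface"]) for e in entries]


# Report every NI whose status is anything other than "up".
down = [entry for entry in ni_status(SAMPLE) if entry[1] != "up"]
for nid, status, iface in down:
    print(f"{nid} ({iface}): {status}")
```

For the output above this reports only the ib0 NI (AA.BB.CC.34 at o2ib).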
However, we are not sure what this means, because the interface looks fine on both the InfiniBand and the kernel side (see the output at the end of this mail). Running lnetctl net show -v or lnetctl export several times, with pauses in between, shows the send and recv counters increasing (for both ib0 and ib1):

    - net type: o2ib
      local NI(s):
        - nid: AA.BB.CC.34 at o2ib
          status: down
          interfaces:
              0: ib0
          statistics:
              send_count: 286859369
              recv_count: 291921704
              drop_count: 1969
<20s pause>
    - net type: o2ib
      local NI(s):
        - nid: AA.BB.CC.34 at o2ib
          status: down
          interfaces:
              0: ib0
          statistics:
              send_count: 286861252
              recv_count: 291923587
              drop_count: 1969


So the interface seems to be in use!
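The difference between the two snapshots can be checked with a short calculation. The counter values below are copied from the output above; the 20 seconds is the approximate pause noted there, not a measured interval, so the per-second rate is only a rough figure. The names before, after and deltas are chosen for illustration.

```python
# Counter values copied from the two `lnetctl net show -v` snapshots of ib0.
before = {"send_count": 286859369, "recv_count": 291921704, "drop_count": 1969}
after = {"send_count": 286861252, "recv_count": 291923587, "drop_count": 1969}

# Assumption: roughly 20 s passed between the snapshots ("<20s pause>").
PAUSE_SECONDS = 20


def deltas(first, second):
    """Return how much each counter grew between two snapshots."""
    return {name: second[name] - first[name] for name in first}


diff = deltas(before, after)
for name, value in diff.items():
    print(f"{name}: +{value} (about {value / PAUSE_SECONDS:.0f} per second)")
```

send_count and recv_count each grow by 1883 while drop_count stays at 1969, so messages are passing over the NI that is reported as down, without new drops.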

All this leads to the following questions:
- What does "down" mean here? What are the consequences?
- What could be the reason?
- What can we do to examine this further?
- How can we change the interface to status up?

I can provide further information if required.


Here are the InfiniBand and kernel stats of the interface on MDS1:

[ mds1 ]# ibdev2netdev 
mlx5_0 port 1 ==> ib0 (Up)
...
[ mds1 ]# ibstatus
Infiniband device 'mlx5_0' port 1 status:
        default gid:     <censored>
        base lid:        0x21
        sm lid:          0x2
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            200 Gb/sec (4X HDR)
        link_layer:      InfiniBand
...


[ mds1 ]# ibportstate -L 33 1 
CA/RT PortInfo:
# Port info: Lid 33 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................33
SMLid:...........................2
LMC:.............................0
LinkWidthSupported:..............1X or 4X or 2X
LinkWidthEnabled:................1X or 4X or 2X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
LinkSpeedExtSupported:...........14.0625 Gbps or 25.78125 Gbps or 53.125 Gbps
LinkSpeedExtEnabled:.............14.0625 Gbps or 25.78125 Gbps or 53.125 Gbps
LinkSpeedExtActive:..............53.125 Gbps
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0
# MLNX ext Port info: Lid 33 port 1
StateChangeEnable:...............0x00
LinkSpeedSupported:..............0x00
LinkSpeedEnabled:................0x00
LinkSpeedActive:.................0x00

The IPoIB interface looks fine as well:
[ mds1 ]# ip a s ib0
5: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband <censored> brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet AA.BB.CC.34/21 brd AA.BB.CC.255 scope global noprefixroute ib0
       valid_lft forever preferred_lft forever




Kind regards,

Ulrich Sibiller

-- 
Dipl.-Inf. Ulrich Sibiller
Senior IT Consultant
eviden.com

an atos business


science+computing ag
Management Board: Dr. Martin Matzke (Chairman), Sabine Hohenstein, Matthias Schempp; Chairman of the Supervisory Board: Emmanuel Le Roux; Registered office: Tübingen; Commercial register of the local court of Stuttgart, HRB 382196





