[lustre-discuss] lustre client not able to lctl ping or mount

Pak Lui pak.lui at linaro.org
Tue Sep 4 08:06:09 PDT 2018

Hi all,

I am having an issue with the Lustre client pinging the server over o2ib. I
would like to know whether anyone has a suggestion as to what the problem
could be. Thanks in advance.

lustre client pinging to server:

[root at n0 ~]# lctl ping at o2ib
failed to ping at o2ib: Input/output error <<<<<<<

lustre client pinging to server over IPoIB works:

[root at n0~]# ping -c 1
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.376 ms

lustre client pinging to self or other client works:

[root at n0 ~]# lctl ping at o2ib
12345-0 at lo
12345- at o2ib

lustre client pinging to self or other client over IPoIB works:

[root at n0~]# ping -c 1
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.017 ms

Both the lustre server and the client set the following modprobe option for lnet:

options lnet networks=o2ib(ib0)

The client reports the following errors when trying to ping or mount the
server:
modprobe lustre lnet
lctl ping at o2ib
mount -v -t lustre at o2ib:/zfs /mnt/zfs

[root at n0 ~]# dmesg|tail
[589805.093447] Lustre: Lustre: Build Version: 2.11.54
[589805.272652] LNet: Using FastReg for registration
[589805.275954] LNet: Added LNI at o2ib [8/256/0/180]
[589813.278370] LNet: 22357:0:(o2iblnd_cb.c:3320:kiblnd_check_conns())
Timed out tx for at o2ib: 589813 seconds
[589835.518404] LustreError:
22463:0:(mgc_request.c:251:do_config_log_add()) MGC192.168.13.8 at o2ib:
failed processing log, type 1: rc = -5
[589843.118385] LustreError: 22488:0:(mgc_request.c:601:do_requeue())
failed processing log: -5
[589866.718389] LustreError: 15c-8: MGC192.168.13.8 at o2ib: The configuration
from log 'zfs-client' failed (-5). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other errors.
See the syslog for more information.
[589866.741623] Lustre: Unmounted zfs-client
[589867.278516] LustreError: 22463:0:(obd_mount.c:1599:lustre_fill_super())
Unable to mount  (-5)

The server reports the following error during the mount attempt:

[root at license ~]# Sep  4 07:26:56 license kernel: LNet:
25518:0:(o2iblnd_cb.c:2475:kiblnd_passive_connect()) Can't accept conn from at o2ib (version 12): max_frags 16 incompatible without FMR pool
(256 wanted)
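
The server message above rejects the connection over a fragment-count
mismatch (max_frags 16 offered, 256 wanted), which suggests that the o2ib LND
may be running with different settings on the two nodes. One way to check
this is to dump the LND module parameters on each node and compare them. The
following is only a sketch: the function name dump_module_params is made up
for illustration, and it assumes that the o2ib LND module is named ko2iblnd
and that its parameters are exposed under /sys/module/ko2iblnd/parameters.

```shell
#!/bin/sh
# Print every parameter of a loaded kernel module as "name=value", sorted,
# so that the output from the server and the client can be diffed.
# Assumption: the o2ib LND is the ko2iblnd module (default first argument).
# The sysfs root can be overridden with the second argument for testing.
dump_module_params() {
    module="${1:-ko2iblnd}"
    root="${2:-/sys/module}"
    dir="$root/$module/parameters"

    # The parameters directory only exists while the module is loaded.
    if [ ! -d "$dir" ]; then
        echo "module $module is not loaded (no $dir)" >&2
        return 1
    fi

    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done | sort
}
```

Running dump_module_params on both nodes (after modprobe lnet and lctl
network up), redirecting the output to a file on each and diffing the two
files, would show whether any parameter differs between the x86_64 server and
the aarch64 client.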

The lustre server setup:

[root at license ~]# lfs df -h
UUID                       bytes        Used   Available Use% Mounted on
zfs-MDT0000_UUID          863.4M        7.5M      853.9M   1%
zfs-OST0000_UUID            1.7T       10.0G        1.7T   1%

filesystem_summary:         1.7T       10.0G        1.7T   1% /mnt/zfs

server: RHEL 7.5 (3.10.0-862.el7.x86_64), MLNX_OFED_LINUX-4.4-,
lustre 2.11.54
client: RHEL 7.5 (4.14.0-49.el7a.aarch64), MLNX_OFED_LINUX-4.4- ,
lustre 2.11.54

- Pak
