[lustre-discuss] 2.15.4 o2iblnd on RoCEv2?
Jeff Johnson
jeff.johnson at aeoncomputing.com
Tue Jan 9 19:45:16 PST 2024
Howdy intrepid Lustrefarians,
While starting down the debug rabbit hole I thought I'd raise my hand
and see if anyone has a few magic beans to spare.
I cannot get lnet (via lnetctl) to init an o2iblnd interface on a
RoCEv2 interface.
Running `lnetctl net add --net ib0 --if enp1s0np0` results in:
    net:
        errno: -1
        descr: cannot parse net '<255:65535>'
Nothing in dmesg to indicate why. Search engines aren't coughing up
much here either.
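One thing that may be worth ruling out is the net name itself. The sketch below assumes that LNet network names are an LND type followed by an optional number (as in the `tcp0` that works further down, or `o2ib0`), and that `ib` on its own is not an LND type. The helper name `looks_like_lnet_net` is made up for illustration and only knows the two LND types relevant to this post; it is not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper, not part of Lustre: report whether a name looks like
# an LNet network name, i.e. an LND type followed by an optional number.
# Assumption: only tcp and o2ib are listed here; real LNet knows more types.
looks_like_lnet_net() {
    printf '%s\n' "$1" | grep -Eq '^(tcp|o2ib)[0-9]*$'
}

# Compare the name that worked, a candidate o2iblnd name, and the failing one.
for name in tcp0 o2ib0 ib0; do
    if looks_like_lnet_net "$name"; then
        echo "$name: plausible LNet net name"
    else
        echo "$name: not an LND type plus number"
    fi
done
```

If the name turns out to be the culprit, the equivalent of the failing command would be `lnetctl net add --net o2ib0 --if enp1s0np0`. That is a guess from the error text, not something verified in this environment.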
Env: Rocky 8.9 x86_64, MOFED 5.8-4.1.5.0, Lustre 2.15.4
I'm able to run MPI over the RoCEv2 interface. Utils like ibstatus and
ibdev2netdev report it correctly. ibv_rc_pingpong works fine between
nodes.
Configuring as socklnd works fine. `lnetctl net add --net tcp0 --if
enp1s0np0 && lnetctl net show`
[root@r2u11n3 ~]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 10.0.50.27@tcp
          status: up
          interfaces:
              0: enp1s0np0
I verified the RoCEv2 interface using NVIDIA's `cma_roce_mode` as well
as sysfs references:
[root@r2u11n3 ~]# cma_roce_mode -d mlx5_0 -p 1
RoCE v2
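For anyone wanting to repeat the sysfs side of that check, here is a minimal sketch. It assumes the usual layout where each GID's type is exposed under `/sys/class/infiniband/<dev>/ports/<port>/gid_attrs/types/`. The function name `list_gid_types` and its optional third argument (an alternate sysfs root, handy for trying it on a box without the hardware) are illustrative, not an existing tool.

```shell
#!/bin/sh
# Illustrative helper: print "<gid index>: <type>" for each populated GID.
# $1 = RDMA device (e.g. mlx5_0), $2 = port number,
# $3 = sysfs root (optional, defaults to /sys)
list_gid_types() {
    dir="${3:-/sys}/class/infiniband/$1/ports/$2/gid_attrs/types"
    for f in "$dir"/*; do
        # Skip the unexpanded glob when the directory is missing or empty.
        [ -e "$f" ] || continue
        # Unpopulated GID slots return an error on read; skip them quietly.
        t=$(cat "$f" 2>/dev/null) || continue
        [ -n "$t" ] && echo "${f##*/}: $t"
    done
    return 0
}

# Device and port as used with cma_roce_mode above.
list_gid_types mlx5_0 1
```

Each populated entry should read either `IB/RoCE v1` or `RoCE v2`, which gives a per-GID view to set beside the single default mode that `cma_roce_mode` reports.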
Ideas? Suggestions? Incense?
Thanks,
--Jeff