[lustre-discuss] Lustre mds/ods Server with IB/omnipath and Ethernet clients (dual homed?)
Horn, Chris
chris.horn at hpe.com
Thu Nov 30 08:29:46 PST 2023
Right, when you format a Lustre target, it registers itself with the MGS. Part of that registration is telling the MGS what NIDs the target can be reached at (the MGS, in turn, passes this information to the clients). If you add or delete NIDs then you need to ensure that information is updated with the MGS. This is the procedure I linked in the Ops manual.
lctl list_nids does not tell you which NIDs are registered with the MGS. It only tells you what NIDs are currently defined on the local host. There is some way to inspect the config log to see what NIDs are in there, but I can’t recall the specifics off the top of my head.
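To make the distinction concrete, here is a minimal Python sketch (not a Lustre tool) that compares the NIDs defined on the local host with the NIDs recorded for the target in the configuration. Both input lists and the helper names `normalize` and `unregistered_nids` are illustrative assumptions; how the registered list is read from the config log is left open, as noted above.

```python
def normalize(nid: str) -> str:
    """Accept both 'addr@net' and the archive-obfuscated 'addr at net' form."""
    return nid.strip().replace(" at ", "@")


def unregistered_nids(local_nids, registered_nids):
    """Return local NIDs (in input order) that are missing from the registered set."""
    registered = {normalize(n) for n in registered_nids}
    return [n for n in map(normalize, local_nids) if n not in registered]


# Hypothetical values: what the host defines locally vs. what the MGS was told.
local = ["10.149.0.183@o2ib", "192.0.2.10@tcp"]
registered = ["10.149.0.183@o2ib"]

# Any NID printed here was added on the host but never registered with the MGS.
print(unregistered_nids(local, registered))
```

A non-empty result would suggest that the registration needs to be refreshed using the procedure from the Ops manual.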
Chris Horn
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Laura Hild via lustre-discuss <lustre-discuss at lists.lustre.org>
Date: Thursday, November 30, 2023 at 8:22 AM
To: Philipp Grau <phgrau at zedat.fu-berlin.de>
Cc: Lustre User Discussion Mailing List <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] Lustre mds/ods Server with IB/omnipath and Ethernet clients (dual homed?)
Hi Philipp-
I don't do this a ton so I'm hazy, but do you set nids or nets when you mkfs.lustre? So then maybe you have to tunefs those in when you add more?
-Laura
________________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Philipp Grau <phgrau at zedat.fu-berlin.de>
Sent: Wednesday, November 29, 2023 06:37
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] Lustre mds/ods Server with IB/omnipath and Ethernet clients (dual homed?)
Hello,
I have some questions regarding network connection setup for Ethernet-based
clients.
We have a working Lustre installation with two MDS servers and seven
ODS systems connected to our cluster via Omni-Path/IB. This part is
working fine.
Now we want to add some clients that have only an Ethernet connection
to the Lustre servers (via the Ethernet cards in the servers).
Our MDS and ODS servers have the following lnet setup:
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.149.0.XXX@o2ib # IP of the local ib interface
          status: up
          interfaces:
              0: ib0
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.5.XXX@tcp # IP of the local ethernet interface
          status: up
          interfaces:
              0: eno1
Our test ethernet node:
lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.4.XXX@tcp # same subnet as above, it is a /23
          status: up
          interfaces:
              0: enp225s0f0
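As a quick check of the /23 remark: with a 23-bit prefix, third octets 4 and 5 differ only in a bit that the mask leaves out, so they belong to the same network. A minimal Python sketch using the standard `ipaddress` module illustrates this. The addresses below are hypothetical stand-ins, since the real ones are redacted in this post, and `same_subnet` is an illustrative helper.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, prefix_len: int) -> bool:
    """True if addr_b lies in the network that addr_a belongs to at this prefix length."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    return ipaddress.ip_address(addr_b) in net_a


# Hypothetical addresses with third octets 5 (server) and 4 (client).
server_ip = "10.0.5.20"
client_ip = "10.0.4.10"

print(same_subnet(server_ip, client_ip, 23))  # /23 covers x.x.4.0 through x.x.5.255
print(same_subnet(server_ip, client_ip, 24))  # a /24 would separate the two hosts
```

With a /23 the two hosts are on the same network, so no IP routing is involved between them; with a /24 they would not be.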
So far so good.
I'm able to lnetctl ping in both directions:
Ping the client:
lnetctl ping xxx.xxx.4.xxx@tcp
ping:
    - primary nid: xxx.xxx.4.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: xxx.xxx.4.xxx@tcp
Ping the server:
lnetctl ping xxx.xxx.5.xxx@tcp
ping:
    - primary nid: xxx.xxx.5.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: 10.149.0.183@o2ib
        - nid: xxx.xxx.5.xxx@tcp
But the mount fails, output from dmesg (are there other sources of
debug information?):
LustreError: 25758:0:(ldlm_lib.c:494:client_obd_setup()) can't add initial connection
LustreError: 25758:0:(obd_config.c:559:class_setup()) setup scratch-MDT0000-mdc-ffff8b63003d4000 failed (-2)
LustreError: 25758:0:(obd_config.c:1835:class_config_llog_handler()) MGCxxx.xxx.5.xxx@tcp: cfg command failed: rc = -2
Lustre: cmd=cf003 0:scratch-MDT0000-mdc 1:scratch-MDT0000_UUID 2:10.149.0.183@o2ib
LustreError: 15c-8: MGC160.45.5.246@tcp: The configuration from log 'scratch-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 25734:0:(obd_config.c:610:class_cleanup()) Device 3 not setup
Lustre: Unmounted scratch-client
LustreError: 25734:0:(obd_mount.c:1604:lustre_fill_super()) Unable to mount (-2)
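One way to read the log above: the failing config command names only an o2ib NID for scratch-MDT0000, while the test client has only a tcp net. The following Python sketch pulls the numbered fields out of the `cmd=cf003` line and checks the NID's net against the client's nets. Treating field 2 as the server NID is an assumption based on how the line appears here, and `parse_cfg_line` and `nid_net` are illustrative helpers, not part of Lustre.

```python
def parse_cfg_line(line: str) -> dict:
    """Collect the 'N:value' fields of a logged config command, keyed by N."""
    line = line.replace(" at ", "@")  # undo the mail archive's '@' obfuscation
    fields = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if sep and key.isdigit():
            fields[int(key)] = value
    return fields


def nid_net(nid: str) -> str:
    """Return the LNet network part of a NID, e.g. 'o2ib' for 'addr@o2ib'."""
    return nid.rpartition("@")[2]


line = ("Lustre: cmd=cf003 0:scratch-MDT0000-mdc "
        "1:scratch-MDT0000_UUID 2:10.149.0.183 at o2ib")
client_nets = {"tcp"}  # from the client's 'lnetctl net show', ignoring lo

server_nid = parse_cfg_line(line)[2]  # assumption: field 2 is the server NID
reachable = nid_net(server_nid) in client_nets
print(server_nid, "reachable from client nets:", reachable)
```

If the check reports that the NID is not reachable, the client was handed an address on a network it is not attached to, which would be consistent with the "can't add initial connection" error.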
Does anyone have ideas or reference documentation on this topic?
Do I need some "lnetctl route" stuff?
Do I need some "lnetctl peer add ..." to make the Lustre servers and
clients known to each other?
Any hints are welcome!
Kind regards,
Philipp
--
Philipp Grau | Freie Universitaet Berlin
phgrau at ZEDAT.FU-Berlin.DE | FU-IT - Infrastruktur
Tel: +49 (30) 838 56583 | Fabeckstr. 32
Fax: +49 (30) 838 56721 | 14195 Berlin