[lustre-discuss] Issue of mounting lustre through specified interface

Tung-Han Hsieh thhsieh at twcp1.phys.ntu.edu.tw
Sat Jul 4 00:26:05 PDT 2020


Dear All,

We have Lustre servers (MDS, OSS) with Lustre-2.10.7 installed, with
both tcp and o2ib interfaces:

[  193.016516] Lustre: Lustre: Build Version: 2.10.7
[  193.486408] LNet: Added LNI 192.168.62.151@o2ib [8/256/0/180]
[  193.538200] LNet: Added LNI 192.168.60.151@tcp [8/256/0/180]
[  193.538372] LNet: Accept secure, port 988

We have several clients, all with Lustre-2.12.4. Some have both tcp
and o2ib interfaces. These clients can mount the Lustre file system
through the o2ib interface without any problem, e.g.,

mount -t lustre -o flock 192.168.62.151@o2ib:/chome /home
(this is OK)

However, we have another client, also with Lustre-2.12.4, which has
only a tcp interface. It cannot mount the file system through tcp:

mount -t lustre -o flock 192.168.60.151@tcp:/chome /home
(this fails with "Input/output error, Is the MGS running ?")

The dmesg output on this client reads:

=========================================================================
[3106477.006512] LNetError: 15970:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 192.168.62.151@o2ib from <?>
[3106483.142436] LustreError: 122230:0:(mgc_request.c:249:do_config_log_add()) MGC192.168.60.151@tcp: failed processing log, type 1: rc = -5
[3106492.293968] LustreError: 122238:0:(mgc_request.c:599:do_requeue()) failed processing log: -5
[3106513.861586] LustreError: 15c-8: MGC192.168.60.151@tcp: The configuration from log 'chome-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[3106513.862052] Lustre: Unmounted chome-client
[3106513.862281] LustreError: 122230:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-5)
=========================================================================

Surprisingly, although I specified the tcp interface in the mount
command, Lustre still tries to reach the server through its o2ib NID.

I also tested whether LNet works or not.
(Server NID: 192.168.60.151@tcp, Client NID: 192.168.60.30@tcp)

From the server side:
# /opt/lustre/sbin/lctl ping 192.168.60.30
12345-0@lo
12345-192.168.60.30@tcp

From the client side:
# /opt/lustre/sbin/lctl ping 192.168.60.151
12345-0@lo
12345-192.168.62.151@o2ib
12345-192.168.60.151@tcp

Hence it looks fine.
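
To make the comparison explicit, here is a minimal sketch that filters
"lctl ping" output for NIDs on networks the client does not have. The
function name unreachable_nids is hypothetical (it is not part of
Lustre), and it assumes NIDs are written in the usual address@network
form. It only illustrates that the server advertises an o2ib NID which
a tcp-only client cannot reach directly; it is not a fix.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): print the NIDs found in
# `lctl ping` output whose network is not configured on the client.
#   $1    : space-separated client networks, e.g. "tcp"
#   stdin : output of `lctl ping <server>`
unreachable_nids() {
    client_nets="$1"
    while read -r line; do
        nid="${line#12345-}"           # strip the leading "12345-" prefix
        net="${nid#*@}"                # network name after the "@"
        [ "$net" = "lo" ] && continue  # skip the loopback NID
        case " $client_nets " in
            *" $net "*) ;;             # network exists on the client
            *) echo "$nid" ;;          # network missing on the client
        esac
    done
}

# Client-side ping output from this report, client configured with tcp only:
printf '%s\n' '12345-0@lo' '12345-192.168.62.151@o2ib' '12345-192.168.60.151@tcp' \
    | unreachable_nids "tcp"
```

On a live client the input would come from the real command, e.g.
"/opt/lustre/sbin/lctl ping 192.168.60.151 | unreachable_nids tcp".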

The module options (/etc/modprobe.d/lustre.conf) for server and client are:
- Server:
  options lnet networks="o2ib0(ib0),tcp0(eth0)"
- Client:
  options lnet networks="tcp0(eth0)"

The build options for server and client are:
- Server (Lustre-2.10.7):
  ./configure --prefix=/opt/lustre \
              --with-linux=<linux_kernel_path> \
              --with-o2ib=<compat-rdma-path>

- Client (Lustre-2.12.4):
  ./configure --prefix=/opt/lustre \
              --with-linux=<linux_kernel_path> \
              --with-o2ib=no \
              --disable-server

Could anyone suggest how to solve this problem?


Thanks very much.


T.H.Hsieh

