<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
Hi,<br>
<br>
We also stumbled into this. It is described here:<br>
<br>
<a class="moz-txt-link-freetext" href="https://jira.whamcloud.com/browse/LU-11840">https://jira.whamcloud.com/browse/LU-11840</a><br>
<br>
The best workaround we found was to disable LNet peer discovery on
the 2.12 clients:<br>
<br>
# lnetctl set discovery 0<br>
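<br>
Note that a setting made with lnetctl is a runtime change and is lost when the lnet module is reloaded. A minimal sketch of checking it and making it persistent, assuming the lnet_peer_discovery_disabled module parameter is available in your 2.12 build (please verify on your system) and reusing the client networks line from the original post:<br>
<pre>
# lnetctl global show        (the "discovery" field should now read 0)

# /etc/modprobe.d/lustre.conf on the 2.12 client (assumed parameter name)
options lnet networks="tcp0(eth0)" lnet_peer_discovery_disabled=1
</pre>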
<br>
Cheers,<br>
Hans Henrik<br>
<br>
<div class="moz-cite-prefix">On 04.07.2020 09.26, Tung-Han Hsieh
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20200704072605.GA2594@twcp1.phys.ntu.edu.tw">
<pre class="moz-quote-pre" wrap="">Dear All,
We have Lustre servers (MDS, OSS) with Lustre-2.10.7 installed, with
both tcp and o2ib interfaces:
[ 193.016516] Lustre: Lustre: Build Version: 2.10.7
[ 193.486408] LNet: Added LNI 192.168.62.151@o2ib [8/256/0/180]
[ 193.538200] LNet: Added LNI 192.168.60.151@tcp [8/256/0/180]
[ 193.538372] LNet: Accept secure, port 988
We have several clients, all with Lustre-2.12.4. Some have both tcp
and o2ib interfaces. These clients can mount the Lustre server over the
o2ib interface without any problem, e.g.,
mount -t lustre -o flock 192.168.62.151@o2ib:/chome /home
(this is OK)
However, we have another client, also with Lustre-2.12.4, which only
has a tcp interface. It cannot mount the server through the tcp interface:
mount -t lustre -o flock 192.168.60.151@tcp:/chome /home
(this fails with "Input/output error, Is the MGS running ?")
The dmesg output on this client reads:
=========================================================================
[3106477.006512] LNetError: 15970:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 192.168.62.151@o2ib from <?>
[3106483.142436] LustreError: 122230:0:(mgc_request.c:249:do_config_log_add()) MGC192.168.60.151@tcp: failed processing log, type 1: rc = -5
[3106492.293968] LustreError: 122238:0:(mgc_request.c:599:do_requeue()) failed processing log: -5
[3106513.861586] LustreError: 15c-8: MGC192.168.60.151@tcp: The configuration from log 'chome-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[3106513.862052] Lustre: Unmounted chome-client
[3106513.862281] LustreError: 122230:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-5)
=========================================================================
Surprisingly, although I specified the tcp interface for the mount,
Lustre still tries to reach the server through the o2ib interface.
I also tested whether LNet works or not.
(Server NID: 192.168.60.151@tcp, Client NID: 192.168.60.30@tcp)
From the server side:
# /opt/lustre/sbin/lctl ping 192.168.60.30
12345-0@lo
12345-192.168.60.30@tcp
From the client side:
# /opt/lustre/sbin/lctl ping 192.168.60.151
12345-0@lo
12345-192.168.62.151@o2ib
12345-192.168.60.151@tcp
Hence it looks fine.
The module options (/etc/modprobe.d/lustre.conf) for server and client are:
- Server:
options lnet networks="o2ib0(ib0),tcp0(eth0)"
- Client:
options lnet networks="tcp0(eth0)"
The build options for server and client are:
- Server (Lustre-2.10.7):
./configure --prefix=/opt/lustre \
--with-linux=<linux_kernel_path> \
--with-o2ib=<compat-rdma-path>
- Client (Lustre-2.12.4):
./configure --prefix=/opt/lustre \
--with-linux=<linux_kernel_path> \
--with-o2ib=no \
--disable-server
Could anyone suggest how to solve this problem?
Thanks very much.
T.H.Hsieh
</pre>
</blockquote>
<br>
</body>
</html>