[lustre-discuss] Issue of mounting lustre through specified interface

Hans Henrik Happe happe at nbi.dk
Sat Jul 4 14:29:12 PDT 2020


Hi,

We also stumbled into this. It is described here:

https://jira.whamcloud.com/browse/LU-11840

The best workaround we found was to disable discovery on 2.12 clients:

# lnetctl set discovery 0
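
The command above only changes the running LNet instance. As a sketch for
keeping it across reboots (not tested against this setup, and reusing the
client's /etc/modprobe.d/lustre.conf from the original post as an assumed
location), the same setting can be given as an lnet module parameter:

options lnet networks="tcp0(eth0)" lnet_peer_discovery_disabled=1

To check the current state before mounting:

# lnetctl global show

The output should list "discovery: 0" once discovery is off.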

Cheers,
Hans Henrik

On 04.07.2020 09.26, Tung-Han Hsieh wrote:
> Dear All,
>
> We have Lustre servers (MDS, OSS) with Lustre-2.10.7 installed, with
> both tcp and o2ib interfaces:
>
> [  193.016516] Lustre: Lustre: Build Version: 2.10.7
> [  193.486408] LNet: Added LNI 192.168.62.151@o2ib [8/256/0/180]
> [  193.538200] LNet: Added LNI 192.168.60.151@tcp [8/256/0/180]
> [  193.538372] LNet: Accept secure, port 988
>
> We have several clients, all with Lustre-2.12.4. Some have both tcp
> and o2ib interfaces. These clients can mount the Lustre server through
> the o2ib interface without any problem, i.e.,
>
> mount -t lustre -o flock 192.168.62.151@o2ib:/chome /home
> (this is OK)
>
> However, we have another client, also with Lustre-2.12.4, which only
> has a tcp interface. It cannot mount the server through the tcp interface:
>
> mount -t lustre -o flock 192.168.60.151@tcp:/chome /home
> (this fails with "Input/output error, Is the MGS running ?")
>
> Checking the dmesg message of this client, it reads:
>
> =========================================================================
> [3106477.006512] LNetError: 15970:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 192.168.62.151@o2ib from <?>
> [3106483.142436] LustreError: 122230:0:(mgc_request.c:249:do_config_log_add()) MGC192.168.60.151@tcp: failed processing log, type 1: rc = -5
> [3106492.293968] LustreError: 122238:0:(mgc_request.c:599:do_requeue()) failed processing log: -5
> [3106513.861586] LustreError: 15c-8: MGC192.168.60.151@tcp: The configuration from log 'chome-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> [3106513.862052] Lustre: Unmounted chome-client
> [3106513.862281] LustreError: 122230:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-5)
> =========================================================================
>
> Surprisingly, although I specified the tcp interface for the mount,
> Lustre itself still tries to reach the server through the o2ib interface.
>
> I also tested whether LNet works or not.
> (Server NID: 192.168.60.151@tcp, Client NID: 192.168.60.30@tcp)
>
> From the server side:
> # /opt/lustre/sbin/lctl ping 192.168.60.30
> 12345-0@lo
> 12345-192.168.60.30@tcp
>
> From the client side:
> # /opt/lustre/sbin/lctl ping 192.168.60.151
> 12345-0@lo
> 12345-192.168.62.151@o2ib
> 12345-192.168.60.151@tcp
>
> Hence it looks fine.
>
> The module options (/etc/modprobe.d/lustre.conf) for server and client are:
> - Server:
>   options lnet networks="o2ib0(ib0),tcp0(eth0)"
> - Client:
>   options lnet networks="tcp0(eth0)"
>
> The building options for server and client are:
> - Server (Lustre-2.10.7):
>   ./configure --prefix=/opt/lustre \
>               --with-linux=<linux_kernel_path> \
>               --with-o2ib=<compat-rdma-path>
>
> - Client (Lustre-2.12.4):
>   ./configure --prefix=/opt/lustre \
>               --with-linux=<linux_kernel_path> \
>               --with-o2ib=no \
>               --disable-server
>
> Could anyone suggest how to solve this problem?
>
>
> Thanks very much.
>
>
> T.H.Hsieh
