[lustre-discuss] lustre mount in heterogeneous net environment
Ms. Megan Larko
dobsonunit at gmail.com
Tue Feb 27 12:08:46 PST 2018
Hello List!
We have some Lustre 2.7.18 servers using TCP. We would like to connect some
Mellanox (mlx4) InfiniBand Lustre 2.7.0 clients to them through dual-homed
LNet routers.
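For readers less familiar with routed LNet, a setup of this kind is usually expressed as lnet module options on three classes of host. The fragment below is only a minimal sketch, not the configuration from these machines: the file path, the interface names (eth0, ib0), and the router NIDs (R.R.R.R@o2ib0, S.S.S.S@tcp0) are all placeholder assumptions.

```
# Assumed location on each host: /etc/modprobe.d/lustre.conf
# Each "options" line below belongs on a different host.

# Dual-homed LNet router: member of both networks, forwarding turned on
options lnet networks="tcp0(eth0),o2ib0(ib0)" forwarding="enabled"

# IB client: reaches tcp0 via the router's o2ib NID (placeholder NID)
options lnet networks="o2ib0(ib0)" routes="tcp0 R.R.R.R@o2ib0"

# TCP server: reaches o2ib0 via the router's tcp NID (placeholder NID)
options lnet networks="tcp0(eth0)" routes="o2ib0 S.S.S.S@tcp0"
```

Routes are needed in both directions: the client needs one toward tcp0, and the servers need one back toward o2ib0.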
The "lctl ping" command works from both the server co-located MGS/MDS and
from the client.
The mount of the TCP lustre server share from the IB client starts and then
shortly thereafter fails with "Input/output error Is the MGS running?"
At approximately 20-minute intervals after the client mount request,
/var/log/messages on the Lustre MDS reports:
Lustre: MGS: Client <string> (at A.B.C.D@o2ib) reconnecting
The IB client mount command:
mount -t lustre C.D.E.F@tcp0:/lustre /mnt/lustre
Waits about a minute then returns:
mount.lustre C.D.E.F@tcp0:/lustre at /mnt/lustre failed: Input/output error
Is the MGS running?.
The IB client /var/log/messages file contains:
Lustre: client.c:19349:ptlrpc_expire_one_request(()) @@@ Request sent has
timed out for slow reply ...... -->MGCC.D.E.F@tcp was lost; in progress
operations using this service will fail
LustreError: 15c-8: MGCC.D.E.F@tcp: The configuration from log
'lustre-client' failed (-5) This may be the result of communication errors
between this node and the MGS, a bad configuration, or other errors. See
the syslog for more information.
Lustre: MGCC.D.E.F@tcp: Connection restored to MGS (at C.D.E.F@tcp)
Lustre: Unmounted lustre-client
LustreError: 22939:0:(obd_mount.c:lustre_fill_super()) Unable to mount (-5)
We have not (yet) set any non-default values on the Lustre File System.
* Server: Lustre 2.7.18, CentOS Linux release 7.3.1611 (Core), kernel
3.10.0-514.2.2.el7_lustre.x86_64. The server is Ethernet only; no IB.
* Client: Lustre 2.7.0, RHEL 6.8, kernel 2.6.32-696.3.2.el6.x86_64. The
client uses Mellanox InfiniBand mlx4.
The mount point does exist on the client. The firewall is not an issue;
checked. SELinux is disabled.
NOTE: The server does serve the same /lustre file system to other TCP
Lustre clients.
The client does mount other /lustre_mnt file systems from other IB servers.
The info on
http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes
describes a situation exceedingly similar to ours. I'm not sure which
Lustre settings to check, since I have not explicitly set any to differ
from the default values.
Any hints would be genuinely appreciated.
Cheers,
megan