[Lustre-discuss] Client Eviction Preceded by EHOSTUNREACH and then ENOTCONN?

Isaac Huang isaac_huang at xyratex.com
Tue Jul 12 11:32:27 PDT 2011

On Tue, Jul 12, 2011 at 11:06:40AM -0700, Rick Wagner wrote:
> On Jul 12, 2011, at 11:01 AM, Isaac Huang wrote:
> > On Mon, Jul 11, 2011 at 03:39:34PM -0700, Rick Wagner wrote:
> >> Hi,
> >> ......
> >> I am assuming that -113 is EHOSTUNREACH and -107 is ENOTCONN, and that the error codes from errno.h are being used.
> >> 
> >> We've been experiencing similar problems for a while, and we've never seen IP traffic have a problem. But, clients will begin to have trouble communicating with the Lustre server (seen because an LNET ping will return an I/O error), and things will only recover when an LNET ping is performed from the server to the client NID.
> > 
> > I'd suggest enabling console logging of network errors with 'echo
> > +neterror > /proc/sys/lnet/printk'. Detailed debug messages
> > should then show up in 'dmesg' when you have LNET connectivity problems.
> Thanks, Isaac, I have put that in place. We have that in the sysctl configuration, as part of lnet.debug, and thought that was sufficient. But so far, dmesg and /var/log/messages have looked very similar.
> [root@lustre-oss-0-2 ~]# cat /proc/sys/lnet/printk
> warning error emerg console

You should be able to see 'neterror' in the output of
'cat /proc/sys/lnet/printk' after 'echo +neterror > /proc/sys/lnet/printk';
if you don't, it's a bug. The printk mask is separate from lnet.debug.
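
As a rough illustration of the check described above, the sketch below
tests whether a flag is present in each mask. The helper names
mask_has_flag and report_mask are made up for this example, and the path
/proc/sys/lnet/debug is assumed to be the file behind the lnet.debug
sysctl. The script only reads the masks; it does not change them.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Lustre.

# Succeed if the space-separated mask in $1 contains the exact flag in $2.
mask_has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print whether the mask stored in file $1 contains flag $2.
report_mask() {
    file=$1
    flag=$2
    if [ ! -r "$file" ]; then
        echo "$file: not readable (is the lnet module loaded?)"
        return 0
    fi
    # Collapse newlines and repeated blanks so the mask is one line.
    mask=$(tr -s '[:space:]' ' ' < "$file")
    if mask_has_flag "$mask" "$flag"; then
        echo "$file: $flag is set"
    else
        echo "$file: $flag is NOT set"
    fi
}

# The console mask and the debug mask are checked separately.
report_mask /proc/sys/lnet/printk neterror
report_mask /proc/sys/lnet/debug neterror
```

With the values quoted in this thread, the first file would be reported
as not having neterror set and the second as having it, which is why
setting lnet.debug alone did not change what shows up in dmesg.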

> [root@lustre-oss-0-2 ~]# sysctl -a | grep neterr
> lnet.debug = ioctl neterror net warning error emerg ha config console

- Isaac