[Lustre-discuss] One client node freezes at random

Jeremy Mann jeremy at biochem.uthscsa.edu
Tue Jul 7 07:37:43 PDT 2009


Lustre 1.6.7.1 - Kernel 2.6.22.14

I have one client that randomly loses its Lustre mount. I can still SSH
to the client, and df reports "df: `/lustre': Input/output error". However,
dmesg, /var/log/kern and /var/log/message do not show any error
that would explain why it is losing the Lustre mount. When I try to manually
unmount /lustre on the client, the command just hangs.

Output on the MGS node (when I manually try to unmount the client) is:

LustreError: 14565:0:(handler.c:1601:mds_handle()) operation 41 on
unconnected MDS from 12345-192.168.243.237@tcp
LustreError: 14565:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@
processing error (-107)  req@ffff8100782b8c00 x28223/t0 o41-><?>@<?>:0/0
lens 128/0 e 0 to 0 dl 1246974648 ref 1 fl Interpret:/0/0 rc -107/0
Lustre: laredofs-MDT0000: haven't heard from client
b02c9752-a096-d2b1-e266-e1f95f9b9b5a (at 192.168.243.237@tcp) in 228
seconds. I think it's dead, and I am evicting it.
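
For reference, the "rc -107/0" and "processing error (-107)" fields in the
log above are negated Linux errno values. Below is a minimal Python sketch
for pulling the code out of such a line and naming it. The helper name
decode_rc and the small lookup table are my own illustration and not part of
any Lustre tool, and the table assumes standard Linux errno numbering
(5 = EIO, 107 = ENOTCONN).

```python
import re

# Linux errno values relevant to the messages in this post.
# Assumption: standard Linux numbering; other platforms differ.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_rc(log_text):
    """Return (rc, name, description) for the first 'rc N/M' field, else None."""
    match = re.search(r"\brc (-?\d+)/-?\d+", log_text)
    if match is None:
        return None
    rc = int(match.group(1))
    # Kernel code reports errors as negative errno, so look up the magnitude.
    name, desc = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in this table"))
    return rc, name, desc


line = "lens 128/0 e 0 to 0 dl 1246974648 ref 1 fl Interpret:/0/0 rc -107/0"
print(decode_rc(line))
```

So the MDS is replying ENOTCONN to the client's request, which matches the
"unconnected MDS" message, while the client's df sees EIO.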



-- 
Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672



