[Lustre-discuss] Errors on MDS server

Oleg Drokin Oleg.Drokin at Sun.COM
Wed Nov 14 10:01:17 PST 2007


Hello!

On Nov 14, 2007, at 10:57 AM, Wojciech Turek wrote:
> syslog on MDS server
> Nov 14 14:00:00 mds01 kernel: LustreError: 23667:0:(mds_open.c: 
> 1474:mds_close()) @@@ no handle for file close ino 24640839: cookie  
> 0xc5c8c9ecaec0f5ca  req at 0000010063ee3000 x319994/t0 o35- 
> >e5064fef-6af2-c57f-9688-7f8cf8cb41a7 at NET_0x200010a8e0a65_UUID:-1  
> lens 296/560 ref 0 fl Interpret:/0/0 rc 0/0
> Nov 14 14:00:00 mds01 kernel: LustreError: 23667:0:(ldlm_lib.c: 
> 1437:target_send_reply_msg()) @@@ processing error (-116)   
> req at 0000010063ee3000 x319994/t0 o35->e5064fef-6af2- 
> c57f-9688-7f8cf8cb41a7 at NET_0x200010a8e0a65_UUID:-1 lens 296/560 ref  
> 0 fl Interpret:/0/0 rc -116/0
> Nov 14 14:00:00 mds01 kernel: LustreError: 23667:0:(ldlm_lib.c: 
> 1437:target_send_reply_msg()) Skipped 4 previous similar messages
> Nov 14 14:04:32 mds01 kernel: LustreError: 23665:0:(mds_open.c: 
> 1474:mds_close()) @@@ no handle for file close ino 24640840: cookie  
> 0xc5c8c9ecaec10f38  req at 00000100cf997200 x324461/t0 o35- 
> >e5064fef-6af2-c57f-9688-7f8cf8cb41a7 at NET_0x200010a8e0a65_UUID:-1  
> lens 296/560 ref 0 fl Interpret:/0/0 rc 0/0
> Nov 14 14:04:32 mds01 kernel: LustreError: 23665:0:(ldlm_lib.c: 
> 1437:target_send_reply_msg()) @@@ processing error (-116)   
> req at 00000100cf997200 x324461/t0 o35->e5064fef-6af2- 
> c57f-9688-7f8cf8cb41a7 at NET_0x200010a8e0a65_UUID:-1 lens 296/560 ref  
> 0 fl Interpret:/0/0 rc -116/0
>
> syslog on relevant client
> Nov 14 14:00:00 bindloe01 kernel: LustreError: 11-0: an error  
> occurred while communicating with 10.142.10.201 at tcp1. The mds_close  
> operation failed with -116
> Nov 14 14:00:00 bindloe01 kernel: LustreError: 28428:0:(file.c: 
> 97:ll_close_inode_openhandle()) inode 24640839 mdc close failed: rc  
> = -116
> Nov 14 14:04:32 bindloe01 kernel: LustreError: 11-0: an error  
> occurred while communicating with 10.142.10.201 at tcp1. The mds_close  
> operation failed with -116
> Nov 14 14:04:32 bindloe01 kernel: LustreError: 28429:0:(file.c: 
> 97:ll_close_inode_openhandle()) inode 24640840 mdc close failed: rc  
> = -116

Sounds like the client was evicted earlier, and the evicted client had
some files open.
On eviction the MDS closes all file handles of this client, but the client
still thinks the handles are valid.
After the client reconnects and tries to do something with those
handles (closing the files), it discovers that the file handles are no
longer valid, and those messages are printed.
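
To make the sequence concrete, here is a toy model in Python. It is only a sketch and not Lustre code: the class ToyMDS, its methods, and the client name "client-a" are made up for illustration. The assumption is simply that the server keeps a table of open handles keyed by cookie, and that eviction drops every entry belonging to the evicted client.

```python
import errno


class ToyMDS:
    """Toy stand-in for a metadata server (hypothetical, not Lustre code)."""

    def __init__(self):
        self._handles = {}      # cookie -> (client, inode)
        self._next_cookie = 1

    def open(self, client, inode):
        # Hand out a cookie that the client must present on close.
        cookie = self._next_cookie
        self._next_cookie += 1
        self._handles[cookie] = (client, inode)
        return cookie

    def evict(self, client):
        # Eviction closes every handle the client holds on the server side.
        self._handles = {
            cookie: entry
            for cookie, entry in self._handles.items()
            if entry[0] != client
        }

    def close(self, client, cookie):
        # A cookie the server no longer knows about is a stale handle.
        entry = self._handles.get(cookie)
        if entry is None or entry[0] != client:
            return -errno.ESTALE
        del self._handles[cookie]
        return 0


def demo():
    mds = ToyMDS()
    cookie = mds.open("client-a", 24640839)
    mds.evict("client-a")                  # the client is not told about this
    return mds.close("client-a", cookie)   # the client still holds the old cookie


if __name__ == "__main__":
    print(demo())
```

The close after eviction returns -ESTALE in this model, which corresponds to the rc = -116 seen in the client log on Linux.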

> Could someone give me or point me to information about the source
> of the errors above?
> What does error -116 mean?

errno 116 is ESTALE (stale file handle). You can see those values in
/usr/include/asm-generic/errno.h and errno-base.h (for smaller values).
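
If you would rather not dig through the headers, the same lookup can be done from Python's standard errno module. This is a small sketch; the helper name describe_rc is made up here, and the number-to-name mapping is platform dependent (116 is ESTALE on Linux, other systems may number it differently).

```python
import errno
import os


def describe_rc(rc):
    """Map a kernel-style negative return code to (symbolic name, message)."""
    code = abs(rc)
    # errno.errorcode maps numeric values to names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # On a Linux box this reports ESTALE for -116.
    print(describe_rc(-116))
```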

Bye,
      Oleg



