[Lustre-discuss] lock timeouts and OST evictions on 1.4 server - 1.6 client system.

Oleg Drokin Oleg.Drokin at Sun.COM
Tue Feb 10 06:21:53 PST 2009


Hello!

On Feb 10, 2009, at 5:17 AM, Simon Kelley wrote:
>
> Feb  9 14:05:30 sf-2-3-10 kernel: LustreError: 11-0: an error occurred while communicating with 172.31.96.96@tcp. The obd_ping operation failed with -107
> Feb  9 14:05:30 sf-2-3-10 kernel: LustreError: Skipped 12 previous similar messages
> Feb  9 14:05:30 sf-2-3-10 kernel: Lustre: OSC_sf2-sfs2.internal.sanger.ac.uk_sf2-sfs-ost495_MNT_client_tcp-ffff81021f897800: Connection to service sf2-sfs-ost495 via nid 172.31.96.96@tcp was lost; in progress operations using this service will wait for recovery to complete.
> Feb  9 14:05:30 sf-2-3-10 kernel: Lustre: Skipped 4 previous similar messages
> Feb  9 14:05:30 sf-2-3-10 kernel: LustreError: 167-0: This client was evicted by sf2-sfs-ost495; in progress operations using this service will fail.
>

It would be useful to enable dlm tracing
(echo +dlm_trace >/proc/sys/lnet/debug) on some of those 1.6 nodes
(if you are running with no debug enabled at all, also enable the
rpc_trace and info levels), and to enable the "dump on eviction"
feature (echo 1 >/proc/sys/lustre/dump_on_eviction).
Then, when the next eviction happens, some useful debug data will be
dumped on the client, which you can attach to a bugzilla bug along with
the server-side eviction message (processed with the "lctl dl" command
first).
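
The settings above could be wrapped in a small shell helper. This is
only a sketch: enable_eviction_debug is a hypothetical name, and the
root directory is a parameter so the helper can be tried on a scratch
directory first. The flag names are passed in by the caller exactly as
written in this mail; later Lustre documentation spells them dlmtrace
and rpctrace, so check which spelling the installed release accepts.

```shell
#!/bin/sh
# enable_eviction_debug ROOT FLAG...
# ROOT is the directory holding lnet/debug and lustre/dump_on_eviction
# (/proc/sys on a 1.6 client, per the paths in this mail).
# Each FLAG is added to the debug mask with the "+flag" syntax.
enable_eviction_debug() {
    root="$1"
    shift
    for flag in "$@"; do
        # One write per flag; "+" adds to the mask instead of replacing it.
        echo "+$flag" > "$root/lnet/debug" || return 1
    done
    # Dump the debug log automatically when this client is evicted.
    echo 1 > "$root/lustre/dump_on_eviction" || return 1
}

# Example call, to be run as root on a client (not executed here):
#   enable_eviction_debug /proc/sys dlm_trace rpc_trace info
```

The function returns non-zero on the first failed write, so a rejected
flag name does not go unnoticed.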

> We are also seeing some userspace file operations fail with the error
> "No locks available". These don't generate any logging on the client,
> so I don't have exact timing. It's possible that they are associated
> with further "### lock callback timer expired" server logs.

This error code typically means that an application is attempting to
do some I/O and Lustre, for some reason, no longer has a lock for the
I/O area (the lock is normally obtained once the read or write path is
entered). That could be related to the evictions too, since locks are
revoked at eviction time.
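
Since the failing operations leave no client log of their own, one
rough way to check this is to list the eviction times from the client
syslog and compare them with when the applications failed. A sketch,
assuming the syslog line format shown in the quoted messages (a
15-character timestamp, and the "167-0" eviction notice on one line);
list_evictions is a hypothetical helper name.

```shell
#!/bin/sh
# list_evictions LOGFILE
# Print "timestamp evicting-target" for each client eviction notice.
list_evictions() {
    # 167-0 is the message ID of the eviction notice in the quoted log.
    grep 'LustreError: 167-0:' "$1" |
        # Keep the first 15 characters (syslog timestamp) and the
        # target name that follows "evicted by", up to ";" or a space.
        sed -n 's/^\(.\{15\}\).*evicted by \([^; ]*\).*/\1 \2/p'
}

# Example call (the log path is an assumption, not executed here):
#   list_evictions /var/log/messages
```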

Bye,
     Oleg


