[Lustre-discuss] lock completion timeouts?

Andreas Dilger adilger at sun.com
Thu Nov 20 20:57:22 PST 2008


On Nov 13, 2008  01:12 -0800, daledude wrote:
> * 8tb backup lustre fs using centos 5.2 64bit + lustre 1.6.6. MDT/MGS/
> OST/ALL all on a single server with 4gb memory. I mount the 20tb
> lustre fs on this machine and also run the rsync on it.

This is documented as an unsupported configuration, mainly due to the
risk of deadlock: a client thread flushing dirty data under memory
pressure can end up waiting on an OST thread that is trying to write
that data to disk, but which itself needs to allocate memory to complete
the write...

> Nov 13 00:42:42 mds kernel: Lustre: Request x148006922 sent from
> mybackup-OST0000 to NID 0 at lo 7s ago has timed out (limit 6s).
> Nov 13 00:42:42 mds kernel: LustreError: 11-0: an error occurred while
> communicating with 0 at lo. The ost_write operation failed with -107
> Nov 13 00:42:42 mds kernel: Lustre: mybackup-OST0000-osc-f27eac00:
> Connection to service mybackup-OST0000 via nid 0 at lo was lost; in
> progress operations using this service will wait for recovery to
> complete.

There appears to be a timeout communicating from the local machine to
itself (0 at lo).  That can't possibly be due to "network" problems, because
the lustre "loopback" network is simply a memcpy.  It is possible that,
if the machine is overloaded, some threads are just taking too long
to be scheduled because of the many server threads on the system.

You could try increasing the /proc/sys/lustre/ldlm_timeout value to
see if this helps.  You could also try limiting the number of server
threads running on the system by putting options in /etc/modprobe.conf:

options mds mds_num_threads=32
options ost oss_num_threads=32

That will reduce contention on the node and allow the remaining
threads to be scheduled more often.
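
For the timeout, a quick way to experiment is via /proc (the value 40
below is only an illustration, not a recommendation, and it does not
persist across a reboot):

# check the current lock timeout, then raise it
cat /proc/sys/lustre/ldlm_timeout
echo 40 > /proc/sys/lustre/ldlm_timeout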

Another, better, solution is to mount the 8TB filesystem on one of
the nodes that is also mounting the 20TB filesystem and run rsync
there.
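
For example, something along these lines (the MGS NID, mount points,
and the 20TB filesystem name are placeholders for whatever your setup
actually uses; only "mybackup" is taken from your logs):

# on a client node that already mounts the 20TB filesystem
mount -t lustre <mgs-nid>@tcp0:/mybackup /mnt/mybackup
rsync -a /mnt/20tb-fs/ /mnt/mybackup/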

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
