[Lustre-discuss] Read/Write performance problem

Michael Kluge Michael.Kluge at tu-dresden.de
Tue Oct 6 04:24:36 PDT 2009


Hi all,

our Lustre FS shows an interesting performance problem which I'd like to
discuss, as some of you might have seen this kind of thing before and
maybe someone has a quick explanation of what's going on.

We are running Lustre 1.6.5.1. The problem shows up when we read a
shared file from multiple nodes that has just been written from the same
set of nodes. 512 processes write a checkpoint (1.5 GB from each node)
into a shared file by seeking to position RANK*1.5GB and writing 1.5GB
in 1.44 MB chunks. Writing works fine and gives the full file system
performance. The data is written with write(), and the file is opened
with no flags other than O_CREAT and O_WRONLY. Once the checkpoint is
written, the program is terminated and restarted, and each process reads
back the same portion of the file. For
some reason this almost immediate reading of the same data that was just
written on the same node is very slow. If we a) change the set of nodes
or b) wait a day, we get the full read performance when we use the same
executable and the same shared file. 
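
The access pattern described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the application's actual
code: the function names write_checkpoint and read_checkpoint are
hypothetical, and the sizes are passed in as parameters so the sketch can
be tried with small values instead of 1.5 GB per rank.

```python
import os


def write_checkpoint(path, rank, bytes_per_rank, chunk_size):
    """Write this rank's region of the shared file in fixed-size chunks."""
    # Same flags as described in the post: O_CREAT and O_WRONLY only.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # Seek to RANK * bytes_per_rank, the start of this rank's region.
        os.lseek(fd, rank * bytes_per_rank, os.SEEK_SET)
        chunk = bytes([rank % 256]) * chunk_size  # dummy payload
        remaining = bytes_per_rank
        while remaining > 0:
            # os.write may write fewer bytes than requested; count what it did.
            written = os.write(fd, chunk[:min(chunk_size, remaining)])
            remaining -= written
    finally:
        os.close(fd)


def read_checkpoint(path, rank, bytes_per_rank, chunk_size):
    """Read back this rank's region; return the number of bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, rank * bytes_per_rank, os.SEEK_SET)
        total = 0
        while total < bytes_per_rank:
            data = os.read(fd, min(chunk_size, bytes_per_rank - total))
            if not data:  # unexpected end of file
                break
            total += len(data)
        return total
    finally:
        os.close(fd)
```

In the real runs every rank would call the write function, the program
would exit, and a fresh instance on the same nodes would call the read
function with the same rank and sizes.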

Is there a reason why an immediate read after a write on the same node
from/to a shared file is slow? Is there any additional communication,
e.g. is the client flushing the buffer cache before the first read? The
statistics show that the average time to complete a 1.44MB read request
is increasing during the runtime of our program. At some point it hits
an upper limit or a saturation point and stays there. Is there some kind
of queue that fills up in this kind of write/read scenario? Is there
maybe something tunable in /proc/fs/lustre?
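
For anyone who wants to reproduce the observation, the growth in
per-request read time can be measured with something like the sketch
below. It is only an assumption about how such statistics could be
gathered, not the tool that produced the numbers above; timed_reads and
windowed_means are hypothetical helper names.

```python
import os
import time


def timed_reads(path, offset, length, chunk_size):
    """Read `length` bytes from `offset`; return per-request latencies in seconds."""
    latencies = []
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        done = 0
        while done < length:
            start = time.perf_counter()
            data = os.read(fd, min(chunk_size, length - done))
            elapsed = time.perf_counter() - start
            if not data:  # unexpected end of file
                break
            latencies.append(elapsed)
            done += len(data)
    finally:
        os.close(fd)
    return latencies


def windowed_means(values, window):
    """Average consecutive groups of `window` values to show a trend over time."""
    means = []
    for i in range(0, len(values), window):
        group = values[i:i + window]
        means.append(sum(group) / len(group))
    return means
```

Plotting windowed_means(timed_reads(...), 100) over the run would show
whether the average request time rises and then levels off as described.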


Regards, Michael


-- 

Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW:    http://www.tu-dresden.de/zih