[Lustre-discuss] Performance problems with Lustre 1.6.1
Juan Piernas Canovas
juan.piernascanovas at pnl.gov
Mon Oct 1 15:56:33 PDT 2007
Hi all,
I have set up a small Lustre file system with 1 MDS and 8 OSS/OST. What
is particular about our system is that every OSS is also a client of the
file system (there are 8 clients altogether).
The file system has a 1 GB file striped across all the OSTs. On every
OSS, there is a process which reads the file chunks stored locally,
i.e., on its own OST (since the processes have the striping information
of the file, each one knows which portions of the file are stored on its
OST).
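
To make the access pattern concrete, here is a minimal sketch of how each
process could work out which byte ranges of the file are local. It assumes
the usual round-robin (RAID-0 style) layout, in which stripe number k of
the file lands on stripe object k mod stripe_count. The function name
local_extents and the parameter stripe_index are hypothetical, and the
mapping from stripe index to an actual OST still has to come from the
file's striping information.

```python
MiB = 1 << 20
GiB = 1 << 30


def local_extents(file_size, stripe_size, stripe_count, stripe_index):
    """Return the (offset, length) pairs stored on one stripe object.

    Assumes a round-robin layout: chunk k goes to object k % stripe_count.
    """
    extents = []
    offset = stripe_index * stripe_size   # first chunk held by this object
    step = stripe_size * stripe_count     # distance between its chunks
    while offset < file_size:
        # The last chunk may be shorter than stripe_size.
        extents.append((offset, min(stripe_size, file_size - offset)))
        offset += step
    return extents


# The two configurations from this post, seen from stripe object 3.
small = local_extents(GiB, 1 * MiB, 8, 3)
large = local_extents(GiB, 128 * MiB, 8, 3)
print(len(small), "extents with 1 MB stripes, first two:", small[:2])
print(len(large), "extent with 128 MB stripes:", large)
```

With 1 MB stripes each process issues many small reads spread over the
whole file, while with 128 MB stripes it reads a single contiguous range.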
The problem that I have is that, when the stripe size is 1 MB (which
means that there are 1024 chunks in total, or 128 chunks per OST), it
takes more than 400 seconds to read the file, and the network traffic is
very high. However, if the stripe size is 128 MB (8 chunks altogether,
one per OST), it takes only around 100 seconds to read the file, and the
network traffic is about one tenth of that in the first case. Note that,
in both cases, the data I/O operations are local and that the processes
read the same amount of data.
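
The arithmetic behind those figures can be checked with a short sketch.
It treats 1 GB as 1024 MiB and uses the round numbers quoted above (400 s
and 100 s) as the elapsed times, so the throughput values are rough
estimates rather than measurements. The helper name summarize is
hypothetical.

```python
MiB = 1 << 20
FILE_SIZE = 1024 * MiB   # 1 GB file, taken as 1024 MiB
OST_COUNT = 8


def summarize(stripe_size_mib, elapsed_s):
    """Chunk counts and aggregate read rate for one stripe size."""
    chunks = FILE_SIZE // (stripe_size_mib * MiB)
    return {
        "chunks_total": chunks,
        "chunks_per_ost": chunks // OST_COUNT,
        "aggregate_mib_per_s": (FILE_SIZE / MiB) / elapsed_s,
    }


# (stripe size in MiB, approximate elapsed seconds) for the two cases.
for stripe_mib, seconds in [(1, 400), (128, 100)]:
    print(stripe_mib, "MiB stripes:", summarize(stripe_mib, seconds))
```

Under these assumptions the aggregate rate is only a few MiB/s with 1 MB
stripes and roughly four times that with 128 MB stripes, even though the
same amount of data is read from local disks in both cases.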
Could this be a problem with the lock mechanism and the caching on the
clients? If so, I have seen that the ldlm can be disabled, but how?
(The processes read from disjoint parts of the file, so they do not
really need the ldlm service.)
Thanks in advance,
Juan.