[Lustre-discuss] Performance problems with Lustre 1.6.1

Mon Oct 1 23:01:46 PDT 2007

On Oct 01, 2007  15:56 -0700, Juan Piernas Canovas wrote:
> I have set up a small Lustre file system with 1 MDS and 8 OSS/OST. What 
> is unusual about our system is that every OSS is also a client of the 
> file system (there are 8 clients altogether).
> 
> The file system has a 1 GB file striped across all the OSTs. On every 
> OST, there is a process which reads the file chunks stored locally, 
> i.e., in its own OST (since the processes have the striping information 
> of the file, each one knows which portions of the file are stored in its 
> OST).
> 
> The problem that I have is that, when the stripe size is 1MB (which means 
> that there are 1024 chunks in total, or 128 chunks per OST), it takes 
> more than 400 seconds to read the file, and the network traffic is very 
> high. However, if the stripe size is 128 MB (8 chunks altogether, one 
> per OST), it takes only around 100 seconds to read the file, and the 
> network traffic is about one tenth of that in the 1MB case. Note that, in 
> both cases, the data I/O operations are local and that the processes read 
> the same amount of data.

It sounds like the readahead is reading the "unused" parts of the file
on the other OSTs.  Are you also reading data from disk in 1MB chunks,
or in smaller chunks?  You should read at the stripe size for best
performance in this test.
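
To make "read at the stripe size" concrete, here is a minimal sketch of
what each per-OST reader could do. It assumes plain round-robin (RAID0)
striping, where chunk i of the file lives on stripe object (i mod
stripe_count), and that the caller already knows which stripe index is
local. The function names and the path are hypothetical, not part of any
Lustre tool or API.

```python
import os


def local_chunk_offsets(file_size, stripe_size, stripe_count, stripe_index):
    """Yield (offset, length) for every chunk held by one stripe object.

    Assumption: round-robin striping, so the chunks of stripe object k
    start at k * stripe_size and repeat every stripe_size * stripe_count.
    """
    offset = stripe_index * stripe_size
    step = stripe_size * stripe_count
    while offset < file_size:
        # The last chunk may be shorter than a full stripe.
        yield offset, min(stripe_size, file_size - offset)
        offset += step


def read_local_chunks(path, stripe_size, stripe_count, stripe_index):
    """Read only the local chunks, one stripe-sized request per chunk.

    Returns the number of bytes read. The data is discarded here; a real
    reader would process each buffer.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        file_size = os.fstat(fd).st_size
        for offset, length in local_chunk_offsets(
            file_size, stripe_size, stripe_count, stripe_index
        ):
            done = 0
            while done < length:
                # pread does not move the file position and may return
                # fewer bytes than requested, so loop until the chunk is done.
                buf = os.pread(fd, length - done, offset + done)
                if not buf:
                    break
                done += len(buf)
            total += done
    finally:
        os.close(fd)
    return total
```

For the 1 GB file with a 1MB stripe over 8 OSTs, local_chunk_offsets()
yields 128 chunks per stripe index, matching the numbers in the original
report. Whether stripe-sized reads actually keep readahead from touching
the other OSTs is something to verify in the test itself; this only shows
the access pattern being suggested.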

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.