[lustre-discuss] Doubt about how a file is stored on the OSTs

Mohr Jr, Richard Frank (Rick Mohr) rmohr at utk.edu
Tue May 5 21:32:58 PDT 2015


> On May 6, 2015, at 12:05 AM, Prakrati.Agrawal at shell.com wrote:
> 
> I am doing some performance benchmarking on a Lustre file system. To understand my results, I want to know how a file is written to the OSTs.
> 
> Following is what I am doing:
> 
> I have a file of 64 GB to be written
> 
> number of ranks is 64
> 
> number of nodes is 4
> 
> stripe count is 4
> 
> stripe size is 1GB
> 
>  
> 
> Let the 4 OSTs be OST1, OST2, OST3, and OST4.
> 
> Let the nodes be N1, N2, N3 and N4.
> 
> So each rank is writing 1 GB to 1 of the 4 OSTs.
> 
> What I want to know is this: since 16 ranks are each writing 1 GB from, say, N1, are all of those ranks writing to OST1 only, or might some of them be writing to OST1, some to OST2, and so on?

The way the file data will be organized is like this:

1st GB -> OST1
2nd GB -> OST2
3rd GB -> OST3
4th GB -> OST4
5th GB -> OST1
6th GB -> OST2
….

Depending upon which sections of the file the 16 processes on node N1 are writing to, they may or may not all write to the same OST.  If you are using SMP-style placement (ranks 0-15 on N1, ranks 16-31 on N2, and so on) and assuming that rank N writes the (N+1)st GB of data, then each node would have 4 processes writing to each OST.
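
To make the round-robin mapping concrete, here is a minimal Python sketch of the arithmetic.  It is only an illustration under the assumptions stated above (SMP-style placement, rank N writing the (N+1)st GB, "1GB" taken to mean 2**30 bytes); the function names are made up for this example and are not part of any Lustre tool or API.

```python
from collections import Counter

GIB = 1 << 30  # treating the "1GB" in the thread as 2**30 bytes (an assumption)


def osts_touched(start, length, stripe_size, stripe_count):
    """Return the set of 0-based OST slots covered by bytes [start, start + length)."""
    first_stripe = start // stripe_size
    last_stripe = (start + length - 1) // stripe_size
    # Stripes are assigned to the file's OSTs round-robin.
    return {stripe % stripe_count for stripe in range(first_stripe, last_stripe + 1)}


def writers_per_node(num_ranks, ranks_per_node, stripe_size, stripe_count,
                     bytes_per_rank=GIB):
    """Map node index -> Counter(OST slot -> number of ranks on that node writing to it)."""
    layout = {}
    for rank in range(num_ranks):
        node = rank // ranks_per_node   # SMP-style placement (assumed)
        start = rank * bytes_per_rank   # rank N writes the (N+1)st chunk (assumed)
        counts = layout.setdefault(node, Counter())
        for slot in osts_touched(start, bytes_per_rank, stripe_size, stripe_count):
            counts[slot] += 1
    return layout


if __name__ == "__main__":
    for node, counts in sorted(writers_per_node(64, 16, GIB, 4).items()):
        print(f"N{node + 1}:", {f"OST{slot + 1}": n for slot, n in sorted(counts.items())})
```

With 64 ranks, 16 ranks per node, and a stripe count of 4, every node ends up with 4 writers on each of the 4 OSTs.  A different rank placement or a different rank-to-offset mapping would change the result.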

> Also, if I increase my stripe count (i.e., the number of OSTs used) to the total number of OSTs, but each rank is still writing 1 GB and the total number of ranks is 64, why does performance degrade?

It’s hard to venture a guess about performance without knowing more about the file system (total number of OSTs, total number of servers, interconnect technology, etc.).  The I/O pattern of the application itself can also play a role.  In general, though, if you had a file with stripe_count=64 (and the stripe size left at 1 GB), each process should be writing to its own OST, which should reduce contention and improve performance (assuming that no other applications are running concurrently that could affect I/O).
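
As a rough illustration of the contention argument, the sketch below counts how many ranks would share an OST for a few stripe counts.  It assumes each rank writes exactly one stripe (rank N writes stripe N) and that the file system has at least as many OSTs as the stripe count; writers_per_ost is a hypothetical helper, not a Lustre command.  It says nothing about the servers, the network, or other jobs, so it cannot explain a measured slowdown by itself.

```python
from collections import Counter


def writers_per_ost(num_ranks, stripe_count):
    """Count ranks per OST slot when rank N writes exactly stripe N (assumed pattern)."""
    return Counter(rank % stripe_count for rank in range(num_ranks))


for stripe_count in (4, 16, 64):
    busiest = max(writers_per_ost(64, stripe_count).values())
    print(f"stripe_count={stripe_count}: up to {busiest} ranks per OST")
```

Under these assumptions the number of ranks sharing an OST drops from 16 (stripe_count=4) to 1 (stripe_count=64).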

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


