[lustre-discuss] Doubt about how a file is stored on the OSTs

Prakrati.Agrawal at shell.com
Tue May 5 21:48:38 PDT 2015


Hi,

Thanks for the quick reply.
For the second question, I am talking about a total of 165 OSTs.
So my stripe count is 165, the stripe size is 1 GB, and the total file size is 64 GB.
I have 64 ranks on 4 nodes.
Hence, each rank is writing 1 GB.
Why does my performance degrade then? What extra overhead is incurred?

Thanks,
Prakrati

-----Original Message-----
From: Mohr Jr, Richard Frank (Rick Mohr) [mailto:rmohr at utk.edu] 
Sent: Wednesday, May 06, 2015 10:03 AM
To: Agrawal, Prakrati PTIN-PTT/ICOE
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] Doubt about how a file is stored on the OSTs


> On May 6, 2015, at 12:05 AM, Prakrati.Agrawal at shell.com wrote:
> 
> I am doing some performance benchmarking on a Lustre file system. To understand my results, I wanted to know how a file is written to the OSTs.
> 
> Following is what I am doing:
> 
> I have a file of 64 GB to be written
> 
> number of ranks is 64
> 
> number of nodes is 4
> 
> stripe count is 4
> 
> stripe size is 1GB
> 
>  
> 
> Let the 4 OSTs be OST1, OST2, OST3, and OST4.
> 
> Let the nodes be N1, N2, N3 and N4.
> 
> So each rank is writing 1 GB to 1 of the 4 OSTs.
> 
> What I want to know is this: since 16 ranks are each writing 1 GB from, say, N1, are all of those ranks writing to OST1 only, or might it be the case that, out of those ranks, some are writing to OST1, some to OST2, and so on?

The way the file data will be organized is like this:

1st GB -> OST1
2nd GB -> OST2
3rd GB -> OST3
4th GB -> OST4
5th GB -> OST1
6th GB -> OST2
….

Depending upon which sections of the file the 16 processes on node N1 are writing to, they may or may not all write to the same OST.  If you are using SMP-style placement (ranks 0-15 on N1, ranks 16-31 on N2, and so on) and assuming that rank N writes the (N+1)st GB of data, then each node would have 4 processes writing to each OST.
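
The mapping above can be sketched in a few lines of Python. This is only a simplified model of the round-robin layout described here, not Lustre code: the function names are made up for illustration, OSTs are numbered from 0 within the file's layout (so index 0 corresponds to "OST1" above), and it assumes rank N writes the (N+1)st GB with ranks 0-15 placed on N1.

```python
from collections import Counter

GIB = 1024 ** 3  # 1 GB stripe size, as in the example above


def ost_index(offset, stripe_size, stripe_count):
    """0-based position (within the file's layout) of the OST holding byte `offset`.

    Simplified round-robin model: stripe number modulo stripe count.
    """
    return (offset // stripe_size) % stripe_count


def writers_per_ost(ranks, bytes_per_rank, stripe_size, stripe_count):
    """Count how many of the given ranks start their write on each OST.

    Assumes rank N writes the contiguous region starting at N * bytes_per_rank.
    """
    counts = Counter()
    for rank in ranks:
        counts[ost_index(rank * bytes_per_rank, stripe_size, stripe_count)] += 1
    return dict(counts)


# Ranks 0..15 are assumed to live on node N1 (SMP-style placement).
node1 = writers_per_ost(range(16), GIB, stripe_size=GIB, stripe_count=4)
print(node1)  # {0: 4, 1: 4, 2: 4, 3: 4}
```

Under those assumptions, each of the 4 OSTs sees 4 writers from N1, and the same holds for the other three nodes.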

> Also, if I increase my stripe count, i.e. the number of OSTs used, to the total number of OSTs, but each rank is still writing 1 GB and there are still 64 ranks in total, why does performance degrade?

It’s hard to venture a guess about performance without knowing more about the file system (total number of OSTs, total number of servers, interconnect technology, etc.).  The I/O pattern of the application itself can also play a role.  In general, though, if you had a file with stripe_count=64, each process should be writing to its own OST, which should reduce contention and improve performance (assuming that there are no other applications running concurrently that could affect I/O).
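
As a rough illustration of the stripe_count=64 case, the sketch below counts the largest number of ranks that land on any single OST. It is a simplified model only: the helper name is hypothetical, and it assumes 64 ranks, a 1 GB stripe size, and rank N writing exactly the (N+1)st GB. It says nothing about server, network, or locking costs.

```python
from collections import Counter

GIB = 1024 ** 3


def max_writers_on_one_ost(num_ranks, bytes_per_rank, stripe_size, stripe_count):
    """Largest number of ranks whose data touches the same OST.

    Simplified round-robin model; rank N is assumed to write the contiguous
    region [N * bytes_per_rank, (N + 1) * bytes_per_rank).
    """
    counts = Counter()
    for rank in range(num_ranks):
        first_stripe = (rank * bytes_per_rank) // stripe_size
        last_stripe = ((rank + 1) * bytes_per_rank - 1) // stripe_size
        # Set of OST positions this rank touches (each counted once per rank).
        touched = {s % stripe_count for s in range(first_stripe, last_stripe + 1)}
        for ost in touched:
            counts[ost] += 1
    return max(counts.values())


print(max_writers_on_one_ost(64, GIB, GIB, stripe_count=4))   # 16
print(max_writers_on_one_ost(64, GIB, GIB, stripe_count=64))  # 1
```

With stripe_count=4, 16 ranks share each OST; with stripe_count=64, each rank has an OST to itself in this model.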

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences http://www.nics.tennessee.edu


