[lustre-discuss] Doubt about how a file is stored on the OSTs

Mohr Jr, Richard Frank (Rick Mohr) rmohr at utk.edu
Tue May 5 22:09:45 PDT 2015


> On May 6, 2015, at 12:48 AM, Prakrati.Agrawal at shell.com wrote:
> 
> Thanks for the quick reply.
> For the second question, I am talking about total number of OSTs as 165.
> So my stripe count is 165, stripe size is 1GB and total file size is 64 GB.
> I have 64 ranks on 4 nodes.
> Hence, each is writing 1GB.
> Why does my performance degrade then? What is the extra overhead that is incurred?

I don’t really know for sure, but one thing you could check is the OST distribution for the file.  With a 1 GB stripe size, a 64 GB file only touches the first 64 OSTs in its layout, so those are the ones that matter.  If you run “lfs getstripe” on the file and look at the first 64 OSTs, are those OSTs evenly distributed across all the servers?  Or are there a few servers with above-average numbers of OSTs allocated?  If it looks like the OST distribution might be an issue, you could try creating a file with stripe_count=64 and see if that performs better.
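
As a rough illustration of that check, here is a minimal Python sketch.  It assumes the classic “lfs getstripe” text output, where the object table starts with an “obdidx” header line; other Lustre versions may print a different format.  The sample output and the OST-to-server mapping (ost_to_server) are hypothetical; on a real system you would have to build that mapping yourself from your site’s configuration.

```python
from collections import Counter

GIB = 1024 ** 3

def parse_ost_indices(getstripe_output):
    """Return the OST indices (obdidx column) in layout order.

    Assumes the table begins after a header line whose first field is 'obdidx'.
    """
    indices = []
    in_table = False
    for line in getstripe_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            in_table = True
            continue
        if in_table:
            if fields and fields[0].isdigit():
                indices.append(int(fields[0]))
            elif not fields:
                in_table = False  # a blank line ends the table
    return indices

def stripes_touched(file_size, stripe_size, stripe_count):
    """Number of stripe objects a file of this size actually writes to."""
    full_stripes = -(-file_size // stripe_size)  # ceiling division
    return min(full_stripes, stripe_count)

def osts_per_server(ost_indices, ost_to_server, used_stripes):
    """Count how many of the first `used_stripes` OSTs sit on each server."""
    return Counter(ost_to_server[i] for i in ost_indices[:used_stripes])

# Hypothetical sample output for a 4-stripe file (not from a real system).
sample = """/mnt/lustre/testfile
lmm_stripe_count:   4
lmm_stripe_size:    1073741824
lmm_stripe_offset:  2
        obdidx           objid           objid           group
             2            1001          0x3e9                0
             3            1002          0x3ea                0
             0            1003          0x3eb                0
             1            1004          0x3ec                0
"""

if __name__ == "__main__":
    # Hypothetical mapping: OSTs 0-1 on oss1, OSTs 2-3 on oss2.
    ost_to_server = {0: "oss1", 1: "oss1", 2: "oss2", 3: "oss2"}
    used = stripes_touched(64 * GIB, 1 * GIB, 165)  # the case from the question
    print("stripes touched:", used)
    print(osts_per_server(parse_ost_indices(sample), ost_to_server, used))
```

If one server shows up with a noticeably higher count than the others among the OSTs actually in use, that server is a candidate hot spot.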

Also keep in mind that your bottleneck could be on the client side and not the server.  In that case, just throwing more OSTs at the problem won’t necessarily help.  You might need to use more clients (maybe 8 nodes with 8 ranks each instead of 4 nodes with 16 ranks each).
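
To make the client-side arithmetic concrete, here is a small Python sketch comparing the two layouts.  It only shows how much data each client node has to push, assuming ranks are spread evenly and each rank writes 1 GB as in the question; it says nothing about whether the client really is the bottleneck, which has to be measured.  The function name per_node_load is made up for this example.

```python
GIB = 1024 ** 3

def per_node_load(total_ranks, nodes, bytes_per_rank):
    """Return (ranks per node, bytes written per node) for an even layout."""
    if total_ranks % nodes != 0:
        raise ValueError("ranks do not divide evenly across nodes")
    ranks_per_node = total_ranks // nodes
    return ranks_per_node, ranks_per_node * bytes_per_rank

if __name__ == "__main__":
    for nodes in (4, 8):
        ranks, nbytes = per_node_load(64, nodes, 1 * GIB)
        print(f"{nodes} nodes: {ranks} ranks/node, {nbytes // GIB} GiB per node")
```

Going from 4 to 8 nodes halves the amount each client has to write, which is why adding clients can help when adding OSTs does not.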

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


