[Lustre-discuss] [mpich-devel] ROMIO for Lustre

Rob Latham robl at mcs.anl.gov
Mon Sep 23 06:45:21 PDT 2013


On Sun, Sep 22, 2013 at 08:41:21PM -0500, Jaln wrote:
> Interesting, thanks Rob.
> So I can assume that Hopper (a Cray XE6 with MPT 3.2) contains the
> Lustre-specific optimizations?

Hopper does.  Just to be clear, don't ask for 3.2: run the newest
possible MPT.

You'll have to experiment a bit with the settings described in the
intro_mpi man page.  In my experience, things that are documented as
the default are not actually the default.  It's quite frustrating, but
the Hopper staff is quite good at answering site-specific MPI-IO
questions.

> Does it work both for read and write?

I don't know for sure (it's a Cray, and I have not seen the source
code), but most of the focus was on writes, since in the write path
there are Lustre-level locks to deal with.  In the read path,
Lustre-level locks don't really come into play.
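
To make the aggregator/OST point from the original question (quoted
below) concrete, here is a minimal Python sketch of the arithmetic.  It
is not ROMIO's or Cray's code.  It assumes plain round-robin striping
(stripe s lives on OST s mod stripe_count), and every function name in
it is made up for illustration.  It compares an even contiguous split of
the file among aggregators with a stripe-cyclic split, and reports which
OSTs end up being touched by more than one aggregator.

```python
from collections import defaultdict


def ost_of(offset, stripe_size, stripe_count):
    """OST index holding the byte at `offset`, assuming round-robin striping."""
    return (offset // stripe_size) % stripe_count


def contiguous_domains(file_size, n_aggs):
    """Even split: aggregator a gets one contiguous byte range [lo, hi)."""
    chunk = -(-file_size // n_aggs)  # ceiling division
    domains = []
    for agg in range(n_aggs):
        lo = agg * chunk
        if lo < file_size:
            domains.append((agg, lo, min(lo + chunk, file_size)))
    return domains


def stripe_cyclic_domains(file_size, stripe_size, n_aggs):
    """Stripe-aligned split: stripe s is assigned to aggregator s % n_aggs."""
    n_stripes = -(-file_size // stripe_size)  # ceiling division
    domains = []
    for s in range(n_stripes):
        lo = s * stripe_size
        domains.append((s % n_aggs, lo, min(lo + stripe_size, file_size)))
    return domains


def aggregators_per_ost(domains, stripe_size, stripe_count):
    """Map each OST index to the set of aggregators whose ranges touch it."""
    touched = defaultdict(set)
    for agg, lo, hi in domains:
        off = lo
        while off < hi:
            touched[ost_of(off, stripe_size, stripe_count)].add(agg)
            # jump to the start of the next stripe
            off = (off // stripe_size + 1) * stripe_size
    return dict(touched)


def shared_osts(domains, stripe_size, stripe_count):
    """Sorted list of OSTs that more than one aggregator would access."""
    per_ost = aggregators_per_ost(domains, stripe_size, stripe_count)
    return sorted(ost for ost, aggs in per_ost.items() if len(aggs) > 1)


MiB = 1 << 20
# Hypothetical layout: 16 MiB file, 1 MiB stripes, 4 OSTs, 4 aggregators.
print("contiguous   :", shared_osts(contiguous_domains(16 * MiB, 4), MiB, 4))
print("stripe-cyclic:", shared_osts(stripe_cyclic_domains(16 * MiB, MiB, 4), MiB, 4))
```

With this layout the contiguous split has every aggregator touching all
four OSTs, while the stripe-cyclic split leaves each OST with a single
aggregator.  Whether a given MPI-IO library actually partitions file
domains this way is exactly the thing you have to check per
implementation.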

==rob

> Jailin
> 
> 
> On Sun, Sep 22, 2013 at 2:00 PM, Rob Latham <robl at mcs.anl.gov> wrote:
> 
> > On Sat, Sep 21, 2013 at 11:21:19PM -0500, Jaln wrote:
> > > Hi everyone,
> > >
> > > I'm not sure whether the Lustre or the MPI forum is the right place for my
> > > question.
> >
> > both, i guess :>
> >
> > > The question is about the ROMIO optimization on Lustre,
> > > In one SC'08 paper,
> > > http://users.eecs.northwestern.edu/~wkliao/PAPERS/fd_sc08_revised.pdf
> > >
> > > , it's said that the way ROMIO assigns the file domains to I/O aggregators
> > > will not make two aggregators access the same OST.
> > >
> > > In my understanding, this means that data locality at the Lustre layer has been
> > > taken care of in ROMIO, such that the aggregators will not
> > > compete for the same OST.
> > >
> > > My question is: "Is this optimization used in all current Lustre systems,
> > > e.g., Hopper at NERSC?"
> >
> > Wei-keng never contributed the specific ROMIO optimizations he discussed in
> > the SC 08 paper, but his work did spur a lot of community discussion
> > and contributions.
> >
> > Emoly Lu contributed a bunch of Lustre ADIO driver work, which Pascal
> > Deveze and Martin Pokorny improved upon.   MPICH-1.3 and newer contain
> > these improvements.
> >
> > David Knaak from Cray implemented his own improvements.  Cray's MPI-IO
> > is based on ROMIO, but the Cray modifications are proprietary.  MPT-3.2
> > and newer contain Lustre-specific optimizations.
> >
> > The community has been quiet with respect to Lustre MPI-IO work since
> > then.  I hope that's because everything "just works".
> >
> > ==rob
> >
> > --
> > Rob Latham
> > Mathematics and Computer Science Division
> > Argonne National Lab, IL USA
> >
> 
> 
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


