[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]

Robert Latham robl at mcs.anl.gov
Wed Mar 18 12:36:29 PDT 2009

On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
> Robert Latham wrote:
>> Second favor: documentation... Can you send me a brief summary of
>> the new hints?    
> Sure

Thanks for the documentation.  These explanations are good, but now
we've found a few problems.  The naming issues are rather minor, but
some of your hints aren't compliant with the MPI-IO spec.

>> romio_lustre_CO
> In stripe-contiguous IO pattern, each OST will be accessed by a group of  
> IO clients. CO means *C*lient/*O*ST ratio, the max. number of IO clients  
> for each OST.
> CO=1 by default.

To make it clearer, how about calling it "romio_lustre_co_ratio"?

>> romio_lustre_bigsize
> We won't do collective I/O if this hint is set and the IO request size
> is bigger than this value. That's because when the request size is big,
> the collective communication overhead increases and the benefits from
> collective I/O become limited.

Instead of 'bigsize' how about "romio_lustre_coll_highwater" or
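
For reference, the rule described for this hint can be sketched as a
small predicate.  This is only an illustration, not the driver's code:
the function name is hypothetical, and treating a threshold of 0 as
"hint not set" is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of ROMIO.
 * threshold == 0 is assumed here to mean "hint not set". */
bool use_collective_io(size_t request_size, size_t threshold)
{
    if (threshold == 0)
        return true;                   /* hint unset: keep collective I/O */
    return request_size <= threshold;  /* skip collective I/O only above the threshold */
}
```

Note that this hint only changes which I/O path is taken, so a bad
value costs performance but cannot produce wrong results.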

>> romio_lustre_contig_data
>> romio_lustre_samesize
> They are two hints to tell the driver whether the request data are
> contiguous and whether each IO request has the same size.  If they
> are both "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
> removing MPI_Alltoall(), because each process can easily calculate
> the pairs of offset and length for each request without collective
> communication.  BTW, currently the optimization works only when
> both are "yes". In the future, we will probably extend it to other
> conditions.

OK, here's the one with the major problem.  RobR reminds me that
MPI-IO requires hints to be optional, and a hint must never cause
incorrect behavior.  A user supplying these hints and then giving you
data that is noncontiguous or not of the same size would cause
incorrect behavior, so these aren't appropriate.

Is there a way you can check what the caller is doing?  The caller can
lie to you via hints, but ROMIO still has to give the right answer.
RobR thought maybe MPI_Allreduce or something along those lines before
the MPI_Alltoall would let you check.
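
Here is a rough sketch of what such a check could look like.  All the
names are hypothetical and this is not the driver's code.  Each process
packs three values so that a single MPI_MIN reduction answers both
questions: the minimum of the contiguity flags is 1 only if every
process is contiguous, and the minimum of the negated sizes gives the
maximum size, so min == max means every request has the same size.  To
keep the example self-contained, min_combine() stands in for the
element-wise work that MPI_MIN would do inside MPI_Allreduce.

```c
#include <stdbool.h>

enum { CHECK_LEN = 3 };  /* hypothetical layout: flag, size, -size */

/* Each process describes its own request. */
void pack_local_check(bool is_contig, long req_size, long out[CHECK_LEN])
{
    out[0] = is_contig ? 1 : 0;  /* min is 1 only if every rank is contiguous */
    out[1] = req_size;           /* min over ranks = smallest request */
    out[2] = -req_size;          /* min over ranks = -(largest request) */
}

/* Stand-in for the element-wise MPI_MIN reduction.  In a real driver
 * this step would be a single call such as:
 *   MPI_Allreduce(local, global, CHECK_LEN, MPI_LONG, MPI_MIN, comm);
 */
void min_combine(const long in[CHECK_LEN], long inout[CHECK_LEN])
{
    for (int i = 0; i < CHECK_LEN; i++)
        if (in[i] < inout[i])
            inout[i] = in[i];
}

/* Take the fast path only if the hints ask for it AND the data agrees. */
bool can_skip_alltoall(bool hints_say_yes, const long global[CHECK_LEN])
{
    bool all_contig = (global[0] == 1);
    bool same_size  = (global[1] == -global[2]);
    return hints_say_yes && all_contig && same_size;
}
```

With a check like this the hints become a request rather than a
promise: if the reduced values disagree with the hints, the driver
falls back to the normal MPI_Alltoall path and still gives the right
answer.  The check does cost one extra small collective per call.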

Your other hints make a lot of intuitive sense to me.  Is this one a
big win, though?  If MPI_Alltoall is giving you a big headache, then
maybe there is a more fundamental problem with the MPI implementation?


Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
