[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]

Robert Latham robl at mcs.anl.gov
Wed Mar 18 12:36:29 PDT 2009


On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
> Robert Latham wrote:
>> Second favor: documentation... Can you send me a brief summary of
>> the new hints?    
> Sure

Thanks for the documentation.  These explanations are good, but now
we've found a few problems.  The naming issues are rather minor, but
some of your hints aren't compliant with the MPI-IO spec,
unfortunately.

>> romio_lustre_CO
>>   
> In stripe-contiguous IO pattern, each OST will be accessed by a group of  
> IO clients. CO means *C*lient/*O*ST ratio, the max. number of IO clients  
> for each OST.
> CO=1 by default.

To make it clearer, how about calling it "romio_lustre_co_ratio"?
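
As an illustration of how a client/OST ratio could bound the number of I/O clients, here is a minimal sketch. It assumes the cap is simply the ratio multiplied by the number of OSTs the file is striped over, limited by the number of processes; the helper name `max_io_clients` and its parameters are hypothetical and not taken from the driver, which may compute this differently.

```c
/* Hypothetical helper, not part of ROMIO: upper bound on the number of
 * I/O clients when each OST may be served by at most co_ratio clients. */
static int max_io_clients(int nprocs, int stripe_count, int co_ratio)
{
    long long cap;

    if (co_ratio < 1)
        co_ratio = 1;                        /* the description gives CO=1 as the default */

    cap = (long long)stripe_count * co_ratio; /* assumed cap: clients per OST times OSTs */
    return cap < nprocs ? (int)cap : nprocs;  /* never more I/O clients than processes */
}
```

With 64 processes and a file striped over 8 OSTs, a ratio of 1 would give 8 I/O clients and a ratio of 4 would give 32 under this assumption.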

>> romio_lustre_bigsize
>>   
> We won't do collective I/O if this hint is set and the IO request size
> is bigger than this value. That's because when the request size is big,
> the collective communication overhead increases and the benefit of
> collective I/O becomes limited.

Instead of 'bigsize', how about "romio_lustre_coll_highwater" or
"romio_lustre_coll_threshold"?
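
The decision this hint describes can be sketched as a single comparison. The function name `use_collective_io` is hypothetical, and treating a value of 0 as "hint not set" is an assumption made for this example, not something stated in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of ROMIO: returns true when the
 * collective path should be used for a request of request_bytes.
 * Assumption: coll_threshold == 0 means the hint was not set. */
static bool use_collective_io(size_t request_bytes, size_t coll_threshold)
{
    if (coll_threshold == 0)
        return true;                         /* hint unset: keep collective I/O */

    /* Requests larger than the threshold skip collective I/O. */
    return request_bytes <= coll_threshold;
}
```

Either way the hint only selects between two correct code paths, so a bad value costs performance rather than correctness.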

>> romio_lustre_contig_data
>> romio_lustre_samesize
>>   
> These two hints tell the driver whether the request data are
> contiguous and whether every IO request has the same size.  If they
> are both "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
> removing MPI_Alltoall(), because each process can easily calculate
> the offset/length pairs for each request without collective
> communication.  BTW, the optimization currently works only when both
> are "yes". In the future, we will probably extend it to other
> conditions.

OK, here's the one with the major problem.  RobR reminds me that
MPI-IO requires hints to be optional: they cannot cause incorrect
behavior.  A user who supplies these hints and then gives you data that
is noncontiguous or not of the same size would get incorrect
behavior, so these hints aren't appropriate.

Is there a way you can check what the caller is doing?  The caller can
lie to you via hints, but ROMIO still has to give the right answer.  RobR
thought maybe an MPI_Allreduce or something along those lines before the
MPI_Alltoall would let you check.
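
A rough sketch of such a check follows. Each rank verifies its own request locally, and the fast path is taken only if every rank is contiguous and all sizes agree. To keep the example free of MPI it loops over an array of per-rank requests; in the driver that loop would presumably become an MPI_Allreduce (for example MPI_LAND on the contiguity flag and MPI_MIN/MPI_MAX on the size). The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one rank's request as (offset, length) pieces. */
struct io_request {
    const long long *offsets;
    const long long *lengths;
    size_t n;
};

/* Local check: every piece starts where the previous one ended. */
static bool request_is_contiguous(const struct io_request *r)
{
    for (size_t i = 1; i < r->n; i++)
        if (r->offsets[i] != r->offsets[i - 1] + r->lengths[i - 1])
            return false;
    return true;
}

/* Total number of bytes in one rank's request. */
static long long request_total_bytes(const struct io_request *r)
{
    long long total = 0;
    for (size_t i = 0; i < r->n; i++)
        total += r->lengths[i];
    return total;
}

/* Stand-in for the reduction across ranks: the MPI_Alltoall may be
 * skipped only if all requests are contiguous and equally sized. */
static bool fast_path_is_safe(const struct io_request *reqs, size_t nranks)
{
    if (nranks == 0)
        return false;

    long long first = request_total_bytes(&reqs[0]);
    for (size_t i = 0; i < nranks; i++) {
        if (!request_is_contiguous(&reqs[i]))
            return false;                    /* some rank is noncontiguous */
        if (request_total_bytes(&reqs[i]) != first)
            return false;                    /* sizes differ between ranks */
    }
    return true;
}
```

With a check like this the hints only say when the test is worth running, and a wrong hint falls back to the general MPI_Alltoall path instead of producing wrong results.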

Your other hints make a lot of intuitive sense to me.  Is this one a
big win, though?  If MPI_Alltoall is giving you a big headache, then
maybe there is a more fundamental problem with the MPI implementation?

Thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


