[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]
Robert Latham
robl at mcs.anl.gov
Wed Mar 18 12:36:29 PDT 2009
On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
> Robert Latham wrote:
>> Second favor: documentation... Can you send me a brief summary of
>> the new hints?
> Sure
Thanks for the documentation. These explanations are good, but now
we've found a few problems. The naming issues are rather minor, but
some of your hints aren't compliant with the MPI-IO spec,
unfortunately.
>> romio_lustre_CO
>>
> In stripe-contiguous IO pattern, each OST will be accessed by a group of
> IO clients. CO means *C*lient/*O*ST ratio, the max. number of IO clients
> for each OST.
> CO=1 by default.
To make it clearer, how about calling it "romio_lustre_co_ratio"?
>> romio_lustre_bigsize
>>
> We won't do collective I/O if this hint is set and the I/O request size
> is bigger than this value. That's because when the request size is big,
> the collective communication overhead increases and the benefits of
> collective I/O become limited.
Instead of 'bigsize' how about "romio_lustre_coll_highwater" or
"romio_lustre_coll_threshold"?
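
For reference, the rule quoted above can be sketched as follows. This is only a model of the decision as described in this thread, written in Python for brevity; it is not the driver's code, and the function and parameter names (`use_collective_io`, `coll_threshold`) are hypothetical.

```python
def use_collective_io(request_size, coll_threshold=None):
    """Return True if collective I/O should be used for this request.

    request_size   -- size of the I/O request in bytes
    coll_threshold -- value of the size hint in bytes, or None if unset
                      (hypothetical name; the thread calls it "bigsize")
    """
    # Hint not set: nothing changes, stay on the collective path.
    if coll_threshold is None:
        return True
    # Hint set: skip collective I/O only when the request is strictly
    # bigger than the threshold, as the description above states.
    return request_size <= coll_threshold
```

For example, `use_collective_io(8 << 20, coll_threshold=4 << 20)` returns False, so an 8 MiB request would bypass collective I/O when the threshold is 4 MiB.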
>> romio_lustre_contig_data
>> romio_lustre_samesize
>>
> These two hints tell the driver whether the request data are
> contiguous and whether each I/O request has the same size. If both
> are "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
> removing MPI_Alltoall(), because each process can easily calculate
> the offset/length pairs for each request without collective
> communication. BTW, the optimization currently works only when both
> hints are "yes". In the future we will probably extend it to other
> conditions.
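
To illustrate why no communication is needed in that case, here is a minimal sketch. It assumes a layout the thread does not spell out: every rank issues one contiguous request of the same size, and the requests are laid out in rank order starting at a common base offset. The name `calc_requests_locally` is hypothetical, and the real ADIOI_LUSTRE_Calc_others_req() does more than this.

```python
def calc_requests_locally(nprocs, base_offset, req_size):
    """Compute every rank's (offset, length) pair without communication.

    Assumes (not confirmed by the thread) that rank r accesses the
    contiguous byte range starting at base_offset + r * req_size.
    """
    # Every rank can evaluate this same expression on its own, so the
    # exchange of offsets and lengths over MPI_Alltoall() is unnecessary.
    return [(base_offset + rank * req_size, req_size)
            for rank in range(nprocs)]
```

The result is only correct if both properties really hold for every rank, which is the concern raised below.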
OK, here's the one with the major problem. RobR reminds me that
MPI-IO requires hints to be optional, and a hint must never cause incorrect
behavior. A user supplying these hints and then giving you data that
is noncontiguous or not of the same size would cause incorrect
behavior, so these aren't appropriate.
Is there a way you can check what the caller is doing? The caller can lie
to you via hints, but ROMIO still has to give the right answer. RobR
thought maybe MPI_Allreduce or something along those lines before the
MPI_Alltoall would let you check.
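
One way such a check might look, sketched in pure Python rather than MPI: the `allreduce` helper below is only a stand-in for MPI_Allreduce that operates on a list holding one value per rank, and all names are hypothetical. The idea is to treat the hints as a request to try the fast path, then verify the actual access pattern with a small reduction and fall back to the MPI_Alltoall path if the hints turn out to be wrong.

```python
def allreduce(values, op):
    """Stand-in for MPI_Allreduce: reduce one value per rank with op."""
    return op(values)

def hints_hold(local_contig, local_sizes):
    """Check the real access pattern, independent of what the hints say.

    local_contig[r] -- True if rank r's data is actually contiguous
    local_sizes[r]  -- rank r's actual request size in bytes
    """
    all_contig = allreduce(local_contig, all)  # logical AND over ranks
    min_size = allreduce(local_sizes, min)
    max_size = allreduce(local_sizes, max)
    # Sizes are identical on every rank exactly when min equals max.
    return all_contig and min_size == max_size

def choose_path(hint_contig, hint_samesize, local_contig, local_sizes):
    """Use the fast path only when the hints are set AND verified."""
    if hint_contig and hint_samesize and hints_hold(local_contig, local_sizes):
        return "local-calc"
    # Hints absent or wrong: the answer must still be right.
    return "alltoall"
```

With this structure a caller who sets both hints but passes requests of different sizes still gets the "alltoall" path. The cost is one small reduction per call, so it only pays off if that is cheaper than the exchange it replaces.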
Your other hints make a lot of intuitive sense to me. Is this one a
big win, though? If MPI_Alltoall is giving you a big headache, then
maybe there is a more fundamental problem with the MPI implementation?
Thanks
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA B29D F333 664A 4280 315B