[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]
Rob Ross
rross at mcs.anl.gov
Thu Mar 19 00:07:14 PDT 2009
Hi LiuYing,

Unfortunately, the group here is committed to our interpretation of
the standard: a user who passes a misleading hint to the
implementation cannot cause *incorrect* behavior (i.e., change the
semantics of the call).

One option for determining contiguity is to pass messages at
file_set_view time; if the file view is contiguous, then the access is
contiguous. Since file_set_view is a collective call, you have an
opportunity to do this message passing.
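
A minimal sketch of what such a set-view check might look like, in
Python with the ranks simulated as list entries rather than real MPI
processes. The block-list representation of a file view and the names
view_is_contiguous, allreduce_land and agree_on_contiguity are
assumptions made for illustration; they are not ROMIO functions or
data structures.

```python
from typing import List, Tuple

# Hypothetical view description: (displacement, length) byte blocks of one
# filetype, in the order the view lays them out. Not a ROMIO structure.
Blocks = List[Tuple[int, int]]


def view_is_contiguous(blocks: Blocks) -> bool:
    """Return True if every block starts exactly where the previous one ended."""
    expected = blocks[0][0] if blocks else 0
    for disp, length in blocks:
        if disp != expected:
            return False  # hole or reordering found
        expected = disp + length
    return True


def allreduce_land(flags: List[bool]) -> bool:
    """Stand-in for MPI_Allreduce with MPI_LAND: the AND of all ranks' flags."""
    return all(flags)


def agree_on_contiguity(views: List[Blocks]) -> bool:
    """Simulate the collective step: views[r] is the file view set by rank r."""
    local_flags = [view_is_contiguous(v) for v in views]
    return allreduce_land(local_flags)
```

In a real driver the reduction would be a single MPI_Allreduce issued
inside the collective set-view call, and the agreed result would be
cached for later I/O calls on that file handle.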

I'm not sure how you're avoiding all communication, because the
application processes can still be performing I/O at arbitrary
offsets. Perhaps knowing that the access in the file is contiguous
can still be used to reduce the overall communication at I/O time?
Can you explain further how these hints work? Maybe we can come up
with an alternative together.

Thanks,
Rob
On Mar 19, 2009, at 1:27 AM, emoly.liu wrote:
> Hi Rob,
>
> Robert Latham wrote:
>>
>> On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
>>>> romio_lustre_contig_data
>>>> romio_lustre_samesize
>>> These two hints tell the driver whether the requested data are
>>> contiguous and whether every I/O request has the same size. If
>>> both are "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
>>> removing MPI_Alltoall(), because each process can then calculate
>>> the offset/length pairs for each request without collective
>>> communication. BTW, the optimization currently works only when
>>> both hints are "yes". In the future, we will probably try to
>>> cover other cases.
>>>
>> OK, here's the one with the major problem. RobR reminds me that
>> MPI-IO requires hints to be optional and cannot cause incorrect
>> behavior. A user supplying these hints and then giving you data that
>> is noncontiguous or not of the same size would cause incorrect
>> behavior, so these aren't appropriate.
>>
>> Is there a way you can check what the caller is doing? The caller
>> can lie to you via hints, but ROMIO still has to give the right
>> answer. RobR thought maybe MPI_Allreduce or something along those
>> lines before the MPI_Alltoall would let you check.
>>
> Hmm, it is indeed a problem, although we did see benefits from them
> in our previous tests.
>
> I will look into it. For now, though, would it be possible to just
> document the risk, with a note like "Don't set these two hints
> unless you know exactly what you are doing"?
> If that is still inappropriate, I will remove them from this version
> and submit another patch once I figure out how to do the check with
> low overhead.
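
The check discussed in this thread (an MPI_Allreduce before the
MPI_Alltoall) could be sketched roughly as follows. This is a Python
simulation with one list entry per rank, not ROMIO code; the function
names and the "fast"/"general" labels are hypothetical. The idea shown
is only that the hints are treated as a request to try the shortcut,
and the shortcut is taken only if a cheap reduction confirms that
every rank's request really is contiguous and of the same size.

```python
from typing import List


def hints_hold(contig_flags: List[bool], sizes: List[int]) -> bool:
    """Simulate the verification step.

    contig_flags[r] and sizes[r] describe rank r's actual request.
    all() stands in for MPI_Allreduce with MPI_LAND; min()/max() stand in
    for reductions with MPI_MIN and MPI_MAX.
    """
    all_contig = all(contig_flags)
    same_size = min(sizes) == max(sizes)
    return all_contig and same_size


def choose_path(hint_contig: bool, hint_samesize: bool,
                contig_flags: List[bool], sizes: List[int]) -> str:
    """Pick the exchange strategy; the hints alone never decide it."""
    if hint_contig and hint_samesize and hints_hold(contig_flags, sizes):
        return "fast"     # offsets/lengths computed locally, no MPI_Alltoall
    return "general"      # fall back to the normal MPI_Alltoall exchange
```

With this structure a wrong hint costs one extra reduction and then
falls back to the general path, so the result stays correct. Whether
the reduction is cheap enough to preserve the measured benefit is not
established here and would need testing.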