[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]

Rob Ross rross at mcs.anl.gov
Thu Mar 19 00:07:14 PDT 2009


Hi LiuYing,

Unfortunately, the group here is committed to our interpretation of
the standard: a user who passes a hint that misleads the
implementation must not be able to cause *incorrect* behavior
(i.e., change the semantics of the call).

An option for determining contiguity is to pass messages during  
file_set_view time; if the file view is contiguous, then the access is  
contiguous. Since file_set_view is a collective call, you have an  
opportunity to do this message passing.

I'm not sure how you're avoiding all communication, because the
application processes can still be performing I/O at arbitrary
offsets. Perhaps knowing that the access in the file is contiguous
can still be used to reduce the overall communication at I/O time?
Can you explain further how these hints work? Maybe we can
come up with an alternative together.
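
To make the idea of checking rather than trusting the hints concrete, here is a rough sketch of the kind of check I have in mind. It is not ROMIO code: make_summary, reduce_min, and hints_hold are hypothetical names, and the loop in reduce_min only stands in for what a single MPI_Allreduce with MPI_MIN over three values per process would compute. Each process contributes its contiguity flag, its request size, and the negated size; after the reduction, the minimum size equals the maximum size only if every process issued the same size.

```c
#include <stdio.h>

#define NVALS 3  /* {contiguous flag, request size, -request size} */

/* Hypothetical helper: build the per-process values to be reduced. */
static void make_summary(int is_contig, long req_size, long out[NVALS])
{
    out[0] = is_contig ? 1 : 0;
    out[1] = req_size;
    out[2] = -req_size;   /* min of -size gives -(max size) */
}

/*
 * Stand-in for MPI_Allreduce(..., NVALS, MPI_LONG, MPI_MIN, comm):
 * element-wise minimum over the summaries of all processes.
 */
static void reduce_min(long in[][NVALS], int nprocs, long out[NVALS])
{
    for (int j = 0; j < NVALS; j++) {
        out[j] = in[0][j];
        for (int i = 1; i < nprocs; i++) {
            if (in[i][j] < out[j])
                out[j] = in[i][j];
        }
    }
}

/*
 * The fast path is safe only if every process is contiguous
 * (min flag == 1) and min size == max size.
 */
static int hints_hold(const long reduced[NVALS])
{
    return reduced[0] == 1 && reduced[1] == -reduced[2];
}
```

If hints_hold returns 0, the driver would fall back to the existing MPI_Alltoall path, so a misleading hint costs time but never changes the result. Whether one small reduction is cheap enough to preserve the benefit you measured is something that would have to be tested.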

Thanks,

Rob

On Mar 19, 2009, at 1:27 AM, emoly.liu wrote:

> Hi Rob,
>
> Robert Latham wrote:
>>
>> On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
>>>> romio_lustre_contig_data
>>>> romio_lustre_samesize
>>> These two hints tell the driver whether the request data are
>>> contiguous and whether every I/O request has the same size.  If
>>> both are "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
>>> removing MPI_Alltoall(), because each process can easily calculate
>>> the offset/length pairs for each request without collective
>>> communication.  BTW, the optimization currently works only when
>>> both hints are "yes". In the future we will probably try to
>>> extend it to other cases.
>>>
>> OK, here's the one with the major problem.  RobR reminds me that
>> MPI-IO requires hints to be optional and cannot cause incorrect
>> behavior.  A user supplying these hints and then giving you data that
>> is noncontiguous or not of the same size would cause incorrect
>> behavior, so these aren't appropriate.
>>
>> Is there a way you can check what the caller is doing?  The caller
>> can lie to you via hints, but ROMIO still has to give the right
>> answer.  RobR thought maybe MPI_Allreduce or something along those
>> lines before the MPI_Alltoall would let you check.
>>
> Hmm, it is indeed a problem, although we did benefit from these
> hints in our previous tests.
>
> I will look into it. But for now, would it be possible to document
> the risk with a warning, such as "Don't set these two hints unless
> you know exactly what you are doing"?
> If that is still inappropriate, I will remove them from this version
> and submit another patch once I figure out how to do the check with
> low overhead.




