[Lustre-discuss] [ROMIO Req #940] [Fwd: Re: [Lustre-devel] a new Lustre ADIO driver]

emoly.liu Emoly.Liu at Sun.COM
Fri Apr 24 01:39:08 PDT 2009


Hi Rob,

Here is the new patch for the Lustre ADIO driver, based on MPICH2-1.0.7.

Per our discussion, I made the following changes:

1) Renamed the hints
    - romio_lustre_CO -> romio_lustre_co_ratio
    - romio_lustre_bigsize -> romio_lustre_coll_threshold
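
To make the meaning of romio_lustre_coll_threshold concrete, here is a
minimal sketch of the decision described in the thread: collective I/O is
skipped only when the hint is set and the request is bigger than its value.
The helper name and the use of 0 for "hint not set" are assumptions for
illustration; this is not code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of ROMIO: decide whether a request should
 * take the collective path. A threshold of 0 stands for "hint not set"
 * (an assumption made for this sketch). */
bool use_collective_io(size_t request_bytes, size_t coll_threshold)
{
    if (coll_threshold == 0) {
        return true;   /* hint unset: keep the collective path */
    }
    /* Skip collective I/O only when the request is bigger than the hint. */
    return request_bytes <= coll_threshold;
}
```

With a threshold of 4096 bytes, for example, a 4096-byte request would still
go through collective I/O and an 8192-byte request would not.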

2) Removed the two confusing hints
I removed "romio_lustre_contig_data" and "romio_lustre_samesize", and now
use ADIOI_Calc_others_req() instead of ADIOI_LUSTRE_Calc_others_req().
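
For reference, the kind of check Rob suggested (an MPI_Allreduce before the
MPI_Alltoall) can be sketched without MPI as a reduction over per-rank
values. This is only a serial stand-in under assumed names (struct rank_req
and all_contig_same_size are hypothetical). In a real driver each rank would
contribute its own flag and size to MPI_Allreduce, for instance with
MPI_LAND for the flag and MPI_MIN/MPI_MAX for the size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-rank description of one collective request. */
struct rank_req {
    bool contiguous;   /* is this rank's data contiguous? */
    size_t nbytes;     /* size of this rank's request in bytes */
};

/* Serial stand-in for the reduction: returns true only if every rank's
 * data is contiguous and all ranks request the same number of bytes. */
bool all_contig_same_size(const struct rank_req *reqs, int nprocs)
{
    if (nprocs <= 0) {
        return false;
    }
    bool all_contig = true;              /* logical AND across ranks */
    size_t min_sz = reqs[0].nbytes;      /* minimum across ranks */
    size_t max_sz = reqs[0].nbytes;      /* maximum across ranks */
    for (int i = 0; i < nprocs; i++) {
        all_contig = all_contig && reqs[i].contiguous;
        if (reqs[i].nbytes < min_sz) min_sz = reqs[i].nbytes;
        if (reqs[i].nbytes > max_sz) max_sz = reqs[i].nbytes;
    }
    /* Equal min and max means every rank asked for the same size. */
    return all_contig && min_sz == max_sz;
}
```

Only when such a check passes on all ranks would it be safe to skip the
MPI_Alltoall, whatever the caller claimed through hints.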

I have tested the patch in a small-scale environment.

Please check and let me know if you have any questions.

Thanks,
LiuYing

Robert Latham wrote:
> On Mon, Mar 16, 2009 at 03:41:47PM +0800, emoly.liu wrote:
>   
> Thanks for the documentation.  These explanations are good, but now
> we've found a few problems.  The naming issues are rather minor, but
> some of your hints aren't compliant with the MPI-IO spec,
> unfortunately.
>
>   
>>> romio_lustre_CO
>>>   
>>>       
>> In stripe-contiguous IO pattern, each OST will be accessed by a group of  
>> IO clients. CO means *C*lient/*O*ST ratio, the max. number of IO clients  
>> for each OST.
>> CO=1 by default.
>>     
>
> To make it more clear, how about calling it "romio_lustre_co_ratio" ?
>
>   
>>> romio_lustre_bigsize
>>>   
>>>       
>> We won't do collective I/O if this hint is set and the I/O request size
>> is bigger than this value. That's because when the request size is big,
>> the collective communication overhead increases and the benefit of
>> collective I/O becomes limited.
>>     
>
> Instead of 'bigsize' how about "romio_lustre_coll_highwater" or
> "romio_lustre_coll_threshold"?
>
>   
>>> romio_lustre_contig_data
>>> romio_lustre_samesize
>>>   
>>>       
>> These are two hints that tell the driver whether the request data are
>> contiguous and whether every I/O request has the same size.  If they
>> are both "yes", we can optimize ADIOI_LUSTRE_Calc_others_req() by
>> removing MPI_Alltoall(), because each process can easily calculate
>> the offset/length pairs for each request without collective
>> communication.  BTW, currently the optimization only works when both
>> hints are "yes". In the future, we will probably extend it to other
>> conditions.
>>     
>
> OK, here's the one with the major problem.  RobR reminds me that
> MPI-IO requires hints to be optional and cannot cause incorrect
> behavior.  A user supplying these hints and then giving you data that
> is noncontiguous or not of the same size would cause incorrect
> behavior, so these aren't appropriate.
>
> Is there a way you can check what the caller is doing?  The caller can
> lie to you via hints, but ROMIO still has to give the right answer.  RobR
> thought maybe MPI_Allreduce or something along those lines before the
> MPI_Alltoall would let you check.
>
> Your other hints make a lot of intuitive sense to me.  Is this one a
> big win, though?  If MPI_Alltoall is giving you a big headache, then
> maybe there is a more fundamental problem with the MPI implementation?
>
> Thanks
> ==rob
>
>   


-- 
Best regards,

LiuYing
System Software Engineer, Lustre Group
Sun Microsystems ( China ) Co. Limited

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: adio_driver_mpich2-1.0.7_v7.patch
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20090424/83742654/attachment.txt>

