[Lustre-discuss] [ROMIO Req #940] a new Lustre ADIO driver

David Knaak knaak at cray.com
Tue May 12 02:10:44 PDT 2009


> On Mon, May 11, 2009 at 09:28:12AM -0500, Rob Latham wrote:
> > So, the real challenges are coll_test, noncontig_coll, hindexed,
> > aggregation1, aggregation2, split_coll... basically, collective I/O is
> > messed up. 
> > 
> > hindexed does a collective write followed by an independent read and
> > that's failing, so we should explore the collective write path first
> > and make sure that's working. 

On Tue, May 12, 2009 at 02:08:31AM -0500, David Knaak wrote:
> I'm looking at the collective buffering errors.

Rob,

With my version of the Lustre stripe-aligned collective buffering 
(which merges some of LiuYing's code with mine):

  coll_test passes (2 PEs as required)
  noncontig_coll passes (2 PEs as required)
  hindexed passes (4 PEs as required)
  aggregation1 passes (up to 60 PEs, that's as high as I'm going tonight)
  aggregation2 passes (up to 60 PEs, that's as high as I'm going tonight)
  split_coll passes (up to 60 PEs, that's as high as I'm going tonight)
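
For anyone new to the term, here is a minimal sketch in C of what
"stripe-aligned" means for collective buffering. It is not the code
tested above and not ROMIO's implementation. The function names and the
round-robin mapping of stripes to aggregators are assumptions made only
for illustration. The idea is that each request is cut at Lustre stripe
boundaries, so no aggregator's buffer straddles two stripes.

```c
#include <assert.h>
#include <stdint.h>

/* Round a file offset down to the start of the stripe that contains it. */
static int64_t stripe_floor(int64_t offset, int64_t stripe_size)
{
    assert(offset >= 0 && stripe_size > 0);
    return offset - (offset % stripe_size);
}

/* Hypothetical policy: stripe k is handled by aggregator k % naggs.
 * A real driver may choose a different mapping. */
static int aggregator_for_offset(int64_t offset, int64_t stripe_size,
                                 int naggs)
{
    assert(offset >= 0 && stripe_size > 0 && naggs > 0);
    return (int)((offset / stripe_size) % naggs);
}

/* Length of the next piece of a request: it ends at the next stripe
 * boundary or at the end of the request, whichever comes first. */
static int64_t piece_len(int64_t offset, int64_t remaining,
                         int64_t stripe_size)
{
    assert(offset >= 0 && remaining >= 0 && stripe_size > 0);
    int64_t to_boundary = stripe_size - (offset % stripe_size);
    return remaining < to_boundary ? remaining : to_boundary;
}
```

With an assumed 1 MiB stripe (1048576 bytes) and 4 aggregators, offset
1500000 falls in stripe 1, so under this mapping it goes to aggregator
1, and a 2000000-byte request starting there is first cut after 597152
bytes, at the start of stripe 2.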

The one collective buffering test that fails is noncontig_coll2.  I'll
look at that more closely.

One other test that fails is shared_fp (60 PEs), but it also fails with
collective buffering disabled and, besides, it doesn't make any
collective I/O calls.  I haven't yet looked closely at that test.

As I said last week, I have not yet had time to build the full
Lustre ADIO and therefore can't at this point make any statement about
it.  

David
  


