[Lustre-discuss] [ROMIO Req #940] a new Lustre ADIO driver]

David Knaak knaak at cray.com
Tue May 12 00:08:31 PDT 2009


On Mon, May 11, 2009 at 09:28:12AM -0500, Rob Latham wrote:
> On Sat, May 09, 2009 at 03:53:40PM -0500, Weikuan Yu wrote:
> > Yes, it passed compilation. But there are many errors reported from runtest,
> > quite a number of them are only from ad_lustre driver. Attached is an output
> > tarball (named as .txt though). It contains the output files from running
> > romio/runtests with ad_lustre and ad_ufs drivers separately.
> 
> Thanks for sending both the UFS and Lustre tests.  The atomicity,
> i_noncontig, noncontig, shared_fp, and ordered_fp failures look like fcntl
> locks just don't work on the cray? Both UFS and Lustre say
> "ADIOI_Set_lock:: Function not implemented".  
> 
> That's going to cause some problems not just for atomic mode, but also
> for data sieving writes, so I guess it's good the failure is a loud
> MPI_ABORT and not a quiet corruption of data

Weikuan,

To get file locking to work properly on Cray XT, the Lustre file
system must be mounted with the "flock" option on the compute nodes.
This is the recommended option for all Cray systems.
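
If it helps to confirm the mount before rerunning runtests, a quick
standalone check is sketched below. It assumes the failing path comes
down to an fcntl() byte-range lock, which is what the
"Function not implemented" (ENOSYS) message suggests. The helper name
probe_fcntl_lock is made up for this example and is not part of ROMIO.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of ROMIO: take and release a write lock
 * on the first byte of `path` with fcntl(F_SETLK).
 * Returns 0 on success, or the errno value of the first failing call. */
int probe_fcntl_lock(const char *path)
{
    struct flock lk;
    int fd;
    int err = 0;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;      /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;   /* offsets are relative to start of file */
    lk.l_start = 0;
    lk.l_len = 1;             /* lock a single byte */

    if (fcntl(fd, F_SETLK, &lk) == -1) {
        err = errno;          /* e.g. ENOSYS if locking is unsupported */
    } else {
        lk.l_type = F_UNLCK;  /* release the lock again */
        if (fcntl(fd, F_SETLK, &lk) == -1)
            err = errno;
    }

    close(fd);
    return err;
}
```

Called from a small main() on a compute node with a path on the Lustre
mount, a return value of ENOSYS would point at the missing "flock"
option, while 0 would mean byte-range locks are available there.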

Did you build and run these tests on a Cray XT system?  If so, which one?

> So, the real challenges are coll_test, noncontig_coll, hindexed,
> aggregation1, aggregation2, split_coll... basically, collective I/O is
> messed up. 
> 
> hindexed does a collective write followed by an independent read and
> that's failing, so we should explore the collective write path first
> and make sure that's working. 

Rob,

I'm looking at the collective buffering errors.

David


