[lustre-discuss] question on usage of O_LOV_DELAY_CREATE
Bertschinger, Thomas Andrew Hjorth
bertschinger at lanl.gov
Tue Jul 30 07:40:45 PDT 2024
Hello,
We have an application that fails doing the following on one of our systems:
...
openat(AT_FDCWD, "mpi_test.out", O_WRONLY|O_CREAT|O_NOCTTY|FASYNC, 0611) = 4
pwrite64(4, "\3\0\0\0", 4, 0) = -1 EBADF (Bad file descriptor)
...
It opens a file with O_LOV_DELAY_CREATE (or O_NOCTTY|FASYNC as strace interprets it), and then immediately tries to write to it.
From the comments above ll_file_open() in Lustre:
> If opened with O_LOV_DELAY_CREATE, then we don't do the object creation or open until ll_lov_setstripe() ioctl is called.
It sounds like the expectation is that the process calling open() like this follows it up with an ioctl to set the stripe information prior to writing.
Is this correct? In other words, is it reasonable to say that the failing code is doing something erroneous?
Here's a minimal MPI program that reproduces the problem. The issue only arises with the Cray MPI implementation, however. When tested with Open MPI and ANL MPICH, the openat() call doesn't use O_LOV_DELAY_CREATE. Since the Cray implementation is unfortunately not open source, I have no insight into what this code is "supposed" to be doing. :(
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int err = MPI_Init(&argc, &argv);
    MPI_File fh;

    err = MPI_File_open(MPI_COMM_WORLD, "mpi_test.out",
                        MPI_MODE_WRONLY|MPI_MODE_CREATE, MPI_INFO_NULL, &fh);
    printf("MPI_File_open returned: %d\n", err);

    long data = 3;
    err = MPI_File_write(fh, &data, 1, MPI_LONG, MPI_STATUS_IGNORE);
    printf("MPI_File_write returned: %d\n", err);

    err = MPI_File_close(&fh);
    printf("MPI_File_close returned: %d\n", err);

    MPI_Finalize();
    return 0;
}
Thanks,
Thomas Bertschinger