[Lustre-discuss] Performances and fsync()

pascal.deveze at bull.net pascal.deveze at bull.net
Fri Dec 4 02:24:31 PST 2009


Hi,

I am using b_eff_io to measure the performance of ROMIO over Lustre
version 1.6.7.1.
I am using the new ADIO Lustre driver and see that performance is very
low.
The reason is that the write bandwidth is calculated after a call to
fsync().
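
To illustrate why a slow fsync() dominates the result, here is a minimal
sketch of the arithmetic. It is not b_eff_io's actual code; the helper name
effective_bandwidth and the sample sizes in the note below are assumptions
chosen only for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of b_eff_io): bandwidth in bytes per
 * second when the timed interval covers both the write and the
 * following fsync().
 */
double effective_bandwidth(double bytes, double t_write, double t_fsync)
{
    /* The timed interval must be non-empty. */
    assert(t_write + t_fsync > 0.0);
    return bytes / (t_write + t_fsync);
}
```

For example, if 1 MB were written in 10 ms (a made-up figure), adding a
10 ms fsync() to the timed interval would halve the reported bandwidth,
from about 100 MB/s to about 50 MB/s.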

After investigating, I found that even when the file is empty, fsync()
takes about 10 ms. When there is more than one process, the fsync() calls
seem to be serialized.
The total time is roughly 80 ms for 8 processes:

salloc -n 8 -N 1 mpirun time-fsync -f /mnt/romio/FILE
filename=/mnt/romio/FILE
First sync (proc 0): 0.005534
03: sync : 0.019168 (err=0)
07: sync : 0.028794 (err=0)
01: sync : 0.038586 (err=0)
05: sync : 0.048467 (err=0)
02: sync : 0.058380 (err=0)
00: sync : 0.068205 (err=0)
04: sync : 0.078027 (err=0)
06: sync : 0.087960 (err=0)
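
One way to check the serialization is to look at the gap between
successive completion times. The sketch below assumes the eight per-process
times printed above are passed in ascending order; the helper name
mean_step is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: mean gap between successive completion times.
 * t must hold n values sorted in ascending order, in seconds.
 */
double mean_step(const double *t, size_t n)
{
    /* At least two samples are needed to form a gap. */
    assert(n >= 2);
    return (t[n - 1] - t[0]) / (double)(n - 1);
}
```

With the eight values above (0.019168 through 0.087960), the mean gap is
about 9.8 ms, which matches one fsync() completing at a time.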

The same program on an NFS file takes less than 5 microseconds for
the same fsync() calls with 8 processes:

salloc -n 8 -N 1 mpirun time-fsync -f FILE
filename=FILE
First sync (proc 0): 0.000004
06: sync : 0.000004 (err=0)
04: sync : 0.000004 (err=0)
00: sync : 0.000002 (err=0)
03: sync : 0.000003 (err=0)
02: sync : 0.000004 (err=0)
01: sync : 0.000004 (err=0)
05: sync : 0.000004 (err=0)
07: sync : 0.000004 (err=0)

1) Is this behaviour normal for Lustre?

2) Is it possible to configure something to make this fsync() run faster?


================ source of time_fsync.c =====================================
#include "mpi.h"
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

double t1;
char *opt_filename;
int mynod, fd, err;

int main(int argc, char **argv){
 int ch;   /* getopt() returns an int; a plain char may never equal -1 */

 MPI_Init(&argc,&argv);
 MPI_Comm_rank(MPI_COMM_WORLD, &mynod);

 while ((ch = getopt( argc, argv, "f:" )) != -1) {
    switch(ch) {
       case 'f':
          opt_filename = strdup(optarg);
          if (!mynod) printf("filename=%s \n", opt_filename);
          break;
    }
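 }
 if (!opt_filename) {
    if (!mynod) fprintf(stderr, "usage: %s -f filename\n", argv[0]);
    MPI_Finalize();
    return 1;
 }

 // Proc 0 creates the file and times a first fsync() on it
 if (!mynod) {
   fd = open(opt_filename, O_RDWR | O_CREAT, 0666);
   if (fd < 0) { perror("open"); MPI_Abort(MPI_COMM_WORLD, 1); }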
   t1 = MPI_Wtime();
   fsync(fd);
   printf("First sync (proc 0): %.6f\n", MPI_Wtime()-t1);
   close(fd);
 }
 MPI_Barrier(MPI_COMM_WORLD);
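 // Every process opens the same file; all then call fsync() together
 fd = open(opt_filename, O_RDWR);
 if (fd < 0) { perror("open"); MPI_Abort(MPI_COMM_WORLD, 1); }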
 MPI_Barrier(MPI_COMM_WORLD);
 t1 = MPI_Wtime();
 err=fsync(fd);
 printf("%.2d: sync : %.6f (err=%d)\n", mynod, MPI_Wtime()-t1, err);

 close(fd);
 MPI_Finalize();
 return 0;
}




