[Lustre-discuss] Lustre, locking, and fsync

Robert Olson olson at mcs.anl.gov
Mon Feb 8 13:45:41 PST 2010


I have a question about locking and fsync.

I have an app that keeps job metadata in an XML file that resides on a  
Lustre filesystem (I actually just discovered my running system has it  
on NFS, but I'm seeing an anomaly on Lustre so I'll keep writing).

It uses libxml to read and write the file, so each update has to read
the whole file into memory, make changes, and write it back out.

The approach I'm taking to this is:

	open file => fd
	lock fd (using fcntl F_SETLKW)
	read from fd
	ftruncate fd
	<make modifications>
	fsync fd
	unlock fd
	close fd
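
For concreteness, the sequence above might look roughly like the
following in C. This is a minimal sketch, not the application's actual
code: locked_update, set_lock, the modify_fn callback type and
append_line are hypothetical names, and the callback stands in for the
libxml parse/modify/serialize step. It assumes the whole file fits in
memory, and it builds the new contents before truncating, which differs
slightly from the order listed above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical callback: takes the old contents and returns a malloc'd
 * buffer holding the new contents, storing its length in *new_len. */
typedef char *(*modify_fn)(const char *old, size_t old_len, size_t *new_len);

/* Apply (or release) a blocking fcntl lock covering the whole file. */
static int set_lock(int fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;        /* F_WRLCK to lock, F_UNLCK to unlock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* open -> lock -> read -> modify -> truncate -> write -> fsync -> unlock -> close */
int locked_update(const char *path, modify_fn modify)
{
    int rc = -1;
    char *old = NULL, *upd = NULL;
    size_t got = 0, put = 0, new_len = 0;
    off_t size;
    ssize_t n;
    int fd;

    assert(path != NULL && modify != NULL);
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (set_lock(fd, F_WRLCK) < 0) {
        close(fd);
        return -1;
    }

    /* Read the current contents while holding the lock. */
    size = lseek(fd, 0, SEEK_END);
    if (size < 0 || lseek(fd, 0, SEEK_SET) < 0)
        goto out;
    old = malloc((size_t)size + 1);
    if (old == NULL)
        goto out;
    while (got < (size_t)size) {
        n = read(fd, old + got, (size_t)size - got);
        if (n < 0)
            goto out;
        if (n == 0)
            break;
        got += (size_t)n;
    }
    old[got] = '\0';

    upd = modify(old, got, &new_len);
    if (upd == NULL)
        goto out;

    /* ftruncate does not move the file offset, so rewind explicitly. */
    if (ftruncate(fd, 0) < 0 || lseek(fd, 0, SEEK_SET) < 0)
        goto out;
    while (put < new_len) {
        n = write(fd, upd + put, new_len - put);
        if (n < 0)
            goto out;
        put += (size_t)n;
    }
    if (fsync(fd) < 0)
        goto out;
    rc = 0;
out:
    free(old);
    free(upd);
    set_lock(fd, F_UNLCK);
    close(fd);
    return rc;
}

/* Hypothetical example callback: appends one fixed line to the contents. */
char *append_line(const char *old, size_t old_len, size_t *new_len)
{
    static const char line[] = "entry\n";
    size_t add = sizeof line - 1;
    char *out = malloc(old_len + add);
    if (out == NULL)
        return NULL;
    memcpy(out, old, old_len);
    memcpy(out + old_len, line, add);
    *new_len = old_len + add;
    return out;
}
```

A caller would invoke it as locked_update(path, append_line). Note the
explicit lseek after ftruncate: ftruncate leaves the file offset where
the preceding read left it, so without the rewind the new data would be
written at the old end-of-file offset. The sketch does not retry on
EINTR, which a real implementation would likely want to handle.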

The Lustre system has 4 OSSes, and I'm running the test across 12
compute nodes, all of which have the filesystem mounted with the flock
option (the test fails immediately without flock). I'm running Lustre
1.6.6.

What I'm seeing is that, occasionally, a read picks up an empty or
partial file. Given the locking, this doesn't seem like it should
happen, but I'm sure I'm missing something. I don't see any errors
showing up on the MDS.

thanks,
--bob