[Lustre-discuss] Lustre, locking, and fsync
Robert Olson
olson at mcs.anl.gov
Mon Feb 8 13:45:41 PST 2010
I have a locking/fsync question.
I have an app that keeps job metadata in an XML file that resides on a
Lustre filesystem (I actually just discovered my running system has it
on NFS, but I'm seeing an anomaly on Lustre so I'll keep writing).
It uses libxml to read and write the file, and thus has to read the
whole file into memory, make changes, and write it back out.
The approach I'm taking to this is:
open file => fd
lock fd (using fcntl F_SETLKW)
read from fd
ftruncate fd
<make modifications>
fsync fd
unlock fd
close fd
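
A minimal sketch of the sequence above, written in Python for brevity. It is an illustration only, not the application's actual code: the names `update_locked` and `modify` are hypothetical, and the explicit rewind and write after the truncate are an assumption about what "<make modifications>" involves. Python's `fcntl.lockf` with `LOCK_EX` takes a blocking fcntl-style lock, which is what F_SETLKW does.

```python
import fcntl
import os


def update_locked(path, modify):
    """Read-modify-write `path` while holding an exclusive fcntl lock.

    `modify` is a hypothetical callback: it receives the old file
    contents as bytes and returns the new contents as bytes.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocking exclusive lock on the whole file (fcntl F_SETLKW).
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            # Read the current contents from the start of the file.
            os.lseek(fd, 0, os.SEEK_SET)
            chunks = []
            while True:
                chunk = os.read(fd, 65536)
                if not chunk:
                    break
                chunks.append(chunk)
            old = b"".join(chunks)

            new = modify(old)

            # Truncate, rewind, and write the new contents in full,
            # looping because os.write may write fewer bytes than asked.
            os.ftruncate(fd, 0)
            os.lseek(fd, 0, os.SEEK_SET)
            view = memoryview(new)
            while view:
                written = os.write(fd, view)
                view = view[written:]

            # Flush to storage before the lock is released.
            os.fsync(fd)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return new
```

In this sketch the file is empty between the truncate and the completed write, so any reader that does not hold the same lock (or whose lock is not honored across nodes) could observe an empty or partial file during that window.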
The Lustre system is a 4-OSS system and I'm running the test across 12
compute nodes, all of which have the fs mounted with the flock option
(it falls over immediately without flock). I'm running Lustre 1.6.6.
What I'm seeing is that, occasionally, a read will pick up an empty or
partial file. This doesn't seem like it should happen while the lock is
held, but I'm sure I'm missing something. I don't see any errors showing
up on the MDS.
thanks,
--bob