[lustre-discuss] flock timeout?

Mon Apr 13 08:13:22 PDT 2015

I have a small isolated cluster (RHEL 6.6) with a Lustre filesystem
(v2.4.3), all running over IPoIB.  Currently I have flock turned on
across all nodes.  I'm seeing an issue where the workload I have
running sometimes outputs zero-length files instead of data.
Re-running the job produces correct data, so I'm pretty sure it's not
code related.

My question is: is there some kind of timeout and error from flock
that Lustre will kick back to my code that I could detect?  And if so,
is there a way of changing the timeout delay?  Are there any other
counters somewhere in Lustre that would show me if I'm having a large
number of flock timeouts?
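
To make "an error I could detect" concrete, here is a minimal sketch of
the kind of check the job could do on the client side.  It assumes the
job takes its locks through the ordinary flock(2) call (Python's
fcntl.flock here); the helper name write_locked is hypothetical, and
nothing in it is Lustre-specific.  Which errno, if any, Lustre returns
when a lock request times out is exactly the open question, so the
sketch only logs whatever error comes back and then verifies that the
file on disk is not short or empty.

```python
import errno
import fcntl
import os
import sys


def write_locked(path, data):
    """Write data under an exclusive flock and verify the size on disk.

    Hypothetical helper: logs and re-raises any flock error, and raises
    if the resulting file is shorter than the data written.
    """
    # Open without O_TRUNC so the file is not emptied before the lock is held.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    with os.fdopen(fd, "wb") as f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX)  # blocks until granted or failed
        except OSError as e:
            # The errno is not assumed in advance; just report what we got.
            name = errno.errorcode.get(e.errno, "unknown")
            sys.stderr.write(
                "flock failed on %s: %s (%s)\n" % (path, name, e.strerror)
            )
            raise
        try:
            f.truncate(0)         # discard old contents only once locked
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # surface write errors here, not at close
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

    # A zero-length or short file is reported instead of passing silently.
    size = os.path.getsize(path)
    if size != len(data):
        raise OSError(
            errno.EIO, "short output: %d of %d bytes" % (size, len(data)), path
        )
    return size
```

If the zero-length files come from a failed lock or write, a wrapper
like this would turn them into a logged error with an errno attached
rather than a silent empty file.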