[Lustre-discuss] clients gets EINTR from time to time

John Hammond jhammond at tacc.utexas.edu
Fri Feb 25 12:16:36 PST 2011


On 02/25/2011 11:39 AM, Andreas Dilger wrote:
> On 2011-02-25, at 6:28, "Brian J. Murrell" <brian at whamcloud.com> wrote:
>> On 11-02-25 06:18 AM, Francois  wrote:
>>>
>>> I continue to parse debug logs and keep them posted.
>>
>> I don't understand why you don't just fix your application to handle a
>> perfectly valid and expected condition (that it's currently not
>> handling) instead of wasting time trying to find the cause of the
>> expected condition.  Even if you find it, it's likely not a bug and not
>> something that can/will be fixed.  It's your application that needs to
>> be fixed.
> 
> In all fairness, Brian, it isn't always possible to fix an application like you suggest. It might be commercial (binary only), it might be complex code using 3rd party libraries to do the IO that would lose support if modified, etc.
> 
> I think the first action to debug this is to run on the client with "lctl set_param debug=+trace" or "=~0" which will enable function entry/exit tracing in Lustre. Then when the problem is hit, run "lctl dk /tmp/debug" to dump the Lustre debug log, and search for -4 (which is -EINTR) to see where this error first appears.
> 
> At that point we can make a determination where the source of the error is, and if it is Lustre's fault. I know at one time there was a related problem in the l_wait_event() macro that was improperly masking signals, but I thought it was fixed by 1.8.5. 

Setting aside the moral question of which calls should be interruptible,
I think that the handling of the LUSTRE_FATAL_SIGS (defined in
lustre_lib.h to be SIGKILL, SIGINT, SIGTERM, SIGQUIT, SIGALRM) is
slightly broken.  In certain situations, Lustre will return -EINTR
although no signal was delivered.  That's probably not the end of the
world for most applications, but OTOH I don't think anybody expects
-EINTR to be returned spuriously.

Consider the following sequence:

1) Process P has a Lustre file F open.

2) P has SIGALRM pending (but blocked).

3) P starts writing to F and ends up sleeping in (something like):

  sys_write()
   ...
    ll_extent_lock()
     ...
      osc_enqueue()
       ...
        ptlrpc_queue_wait().

4) The OST does not respond to the request before the deadline, so
l_wait_event() replaces the signal mask of P with one that leaves only
the LUSTRE_FATAL_SIGS unblocked, notices that SIGALRM is now
deliverable, restores the original signal mask of P, and
ptlrpc_queue_wait() returns -EINTR.

5) As P exits from sys_write(), SIGALRM is blocked again (but still
pending), so it doesn't get delivered.

6) P spuriously returns -EINTR from sys_write().
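
A minimal user-space sketch of steps 1-3 in C.  The function names and
the path argument are hypothetical, chosen only for illustration.  The
sketch arranges for a blocked, pending SIGALRM and then writes to a
file; whether the write actually fails with EINTR depends on the OST
missing its deadline (step 4), which this code does not force.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Step 2: block SIGALRM, then raise it, so that it stays pending but is
 * not delivered.  Returns 0 on success, -1 on failure. */
static int make_sigalrm_pending(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &set, NULL) < 0)
        return -1;
    if (raise(SIGALRM) != 0)
        return -1;
    return 0;
}

/* Returns 1 if SIGALRM is currently pending, 0 if not, -1 on error. */
static int sigalrm_is_pending(void)
{
    sigset_t pending;

    if (sigpending(&pending) < 0)
        return -1;
    return sigismember(&pending, SIGALRM);
}

/* Steps 1 and 3: open `path` (assumed to be on a Lustre mount) and write
 * to it while SIGALRM is blocked and pending.
 * Returns 0 if the write succeeded, the errno value of a failed open()
 * or write() otherwise, or -1 if the signal setup itself failed. */
static int write_with_pending_sigalrm(const char *path)
{
    static const char buf[] = "hello\n";
    int fd, err = 0;

    if (make_sigalrm_pending() < 0)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return errno;
    if (write(fd, buf, sizeof(buf) - 1) < 0)
        err = errno;
    close(fd);
    return err;
}
```

A return value of EINTR from write_with_pending_sigalrm(), with
sigalrm_is_pending() still returning 1 afterwards, would correspond to
steps 5 and 6: the call was "interrupted" although no handler ever ran.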

I can reproduce this on 1.8.5/RHEL 5.5.  If the goal is to emulate NFS's
interruptibility during congestion then returning -ERESTARTSYS would be
more appropriate.  Also, it might be worthwhile to make this extra
interruptibility a mount flag, as NFS does.
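
For applications that can be modified, the usual defensive pattern is
to retry on EINTR in user space.  A minimal sketch in C, assuming a
hypothetical helper named write_all() (not part of any library):

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes from `buf` to `fd`,
 * retrying when write() fails with EINTR and continuing after short
 * writes.  Returns 0 on success, or -1 with errno set by write() on any
 * other error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted (or spurious): try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Note that an unconditional retry like this keeps looping for as long as
the call keeps returning EINTR, so a program that relies on a signal to
abort a stuck write would need to check a flag set by its handler
before retrying.  It also does nothing for binary-only applications.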

Best,

John

-- 
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
jhammond at tacc.utexas.edu


