<html dir="ltr">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<style type="text/css" id="owaParaStyle"></style>

</head>

<body fpstyle="1" ocsi="0">

<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">

<div>Following on from my previous message.</div>

<div><br>

</div>

<div>In ldlm_lock_match, interestingly, the other threads do not (initially)</div>

<div>wait for the matched lock to be granted; instead, they first wait for</div>

<div>the LVB_READY flag to be set.  This flag is set only after a lock is granted,</div>

<div>so it's used as a proxy for the granted/waiting state of a lock. </div>

<div><br>

</div>

<div>However, getting this set correctly for async lock requests is a problem.</div>

<div>LDLM_FL_LVB_READY is only set (for extent locks) by osc_lock_lvb_update,</div>

<div>which is called from osc_lock_upcall/osc_lock_upcall_speculative (either directly</div>

<div>or via osc_lock_granted, but still from the upcall).</div>

<div><br>

</div>

<div>The problem that's happening is this:</div>

<div>The reply is received, putting the lock on the waiting list.</div>

<div>The lvb is filled in ldlm_cli_enqueue_fini, but when the upcall is called,</div>

<div>the lock is not granted, so osc_lock_lvb_update is not called,</div>

<div>and LDLM_FL_LVB_READY is not set.</div>

<div><br>

</div>

<div>This is a normal sequence of events for both synchronous and async lock requests.</div>

<div>However, for synchronous lock requests, the original enqueueing thread sleeps</div>

<div>(ldlm_cli_enqueue_fini-->l_completion_ast) waiting for the lock to be granted.</div>

<div>Then, once the lock is granted by a CP_CALLBACK (which fills the LVB again with updated data),</div>

<div>the original enqueueing thread wakes up and returns up to osc_enqueue_base,</div>

<div>which calls osc_enqueue_fini, which calls the upcall.</div>

<div>Now the lock is granted, so osc_lock_lvb_update is called and LDLM_FL_LVB_READY is set.</div>

<div><br>

</div>

<div>For asynchronous lock requests, no one is waiting.  So ldlm_handle_cp_callback fills</div>

<div>the LVB, then grants the lock, then is done.  And so, for async locks, osc_lock_lvb_update</div>

<div>is not called, and LDLM_FL_LVB_READY is not set.</div>

<div><br>

</div>

<div>To recap the sequence of events required:</div>

<div>1. Async lock request sent</div>

<div>2. Reply is received, lock is not granted (upcall is called, but</div>

<div>osc_lock_lvb_update cannot happen because the lock is not granted)</div>

<div>[Normally, at this point, a synchronous lock request would sleep waiting for</div>

<div>the lock to be granted]</div>

<div>3. CP_CALLBACK is received, granting the lock.  LVB is filled.</div>

<div>--> osc_lock_lvb_update is never called and LDLM_FL_LVB_READY is never set.</div>
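
<div><br>

</div>

<div>To make the ordering concrete, here is a toy model of the two paths.  It is only a sketch: none of the toy_* names exist in Lustre, the "upcall" and "CP callback" are reduced to the one effect each has in the description above, and the sleep/wakeup in the synchronous case is modelled simply by running the grant before the upcall.</div>

```c
/* Toy model only: the toy_* names are hypothetical, not Lustre APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_lock {
    bool granted;      /* stands in for the granted/waiting state */
    bool lvb_ready;    /* stands in for LDLM_FL_LVB_READY */
    int  upcall_calls; /* how many times the upcall ran */
};

/* Models the upcall: the LVB update (and the flag) happen only if granted. */
void toy_upcall(struct toy_lock *lk)
{
    assert(lk != NULL);
    lk->upcall_calls++;
    if (lk->granted)
        lk->lvb_ready = true;
}

/* Models the CP callback as described: it grants the lock and is done. */
void toy_cp_callback(struct toy_lock *lk)
{
    assert(lk != NULL);
    lk->granted = true;
}

/* Synchronous path: the enqueuer waits for the grant, then runs the upcall. */
struct toy_lock toy_sync_enqueue(void)
{
    struct toy_lock lk = { false, false, 0 };

    /* Reply received, lock on the waiting list; the enqueuer sleeps here. */
    toy_cp_callback(&lk);  /* the grant wakes the enqueuer */
    toy_upcall(&lk);       /* upcall now sees a granted lock */
    return lk;
}

/* Asynchronous path: the upcall runs on reply and nobody waits for the grant. */
struct toy_lock toy_async_enqueue(void)
{
    struct toy_lock lk = { false, false, 0 };

    toy_upcall(&lk);       /* reply: lock still waiting, flag not set */
    toy_cp_callback(&lk);  /* granted later, but the upcall never reruns */
    return lk;
}
```

<div>In this model both paths end with a granted lock and exactly one upcall, but only the synchronous one ends with the ready flag set, which is the state that leaves ldlm_lock_match waiters stuck.</div>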

<div><br>

</div>

<div>I thought it might be possible to call osc_lock_lvb_update in the upcall even</div>

<div>when the lock is not granted, but the LVB is updated on a CP_CALLBACK, so we'd</div>

<div>fail to update with that newer information.  Presumably that's not OK.</div>

<div>(Also, ldlm_lock_match checks LVB_READY before checking if the lock is granted,</div>

<div>so that would have to change too.  But that's fairly simple.)</div>

<div><br>

</div>

<div>I've been struggling to come up with a solution to this one.</div>

<div>Any thoughts?</div>

<div><br>

</div>

<div>The one thought I have is calling osc_lock_lvb_update in the CP callback handler,</div>

<div>but that feels like a layering violation.  We'd also need some method to ensure</div>

<div>we didn't call osc_lock_lvb_update more than once, but that could probably be done</div>

<div>by checking the LDLM_FL_LVB_READY flag...?</div>
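
<div><br>

</div>

<div>Roughly what I have in mind for the "only once" guard, as a standalone sketch.  Everything named sketch_* or SKETCH_* is made up for illustration and is not existing Lustre code; it is single-threaded, so it ignores whatever locking the real check-then-set would need.</div>

```c
/* Hypothetical sketch: sketch_* names are illustrative, not Lustre code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FL_LVB_READY 0x1u   /* stand-in for LDLM_FL_LVB_READY */

struct sketch_lock {
    uint32_t flags;
    bool     granted;
    int      lvb_updates;          /* counts calls to the LVB update */
};

/* Stand-in for osc_lock_lvb_update: apply the LVB, then mark it ready. */
void sketch_lvb_update(struct sketch_lock *lk)
{
    assert(lk != NULL);
    lk->lvb_updates++;
    lk->flags |= SKETCH_FL_LVB_READY;
}

/*
 * Guarded update: do nothing unless the lock is granted and the flag is
 * still clear.  Returns true if this call performed the update.
 * Real code would need to hold the appropriate lock around the check.
 */
bool sketch_lvb_update_once(struct sketch_lock *lk)
{
    assert(lk != NULL);
    if (!lk->granted || (lk->flags & SKETCH_FL_LVB_READY))
        return false;
    sketch_lvb_update(lk);
    return true;
}

/* Stand-in for the CP callback handler with the proposed extra call. */
void sketch_cp_callback(struct sketch_lock *lk)
{
    assert(lk != NULL);
    lk->granted = true;
    sketch_lvb_update_once(lk);
}

/* Stand-in for the upcall, using the same guard. */
void sketch_upcall(struct sketch_lock *lk)
{
    sketch_lvb_update_once(lk);
}
```

<div>With the guard in both places, it doesn't matter whether the upcall or the CP callback sees the granted lock first: whichever comes first does the update and the other becomes a no-op.  It does nothing about the layering concern, though.</div>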

<div><br>

</div>

<div>- Patrick</div>

</div>

</body>

</html>