[lustre-devel] [PATCH 106/151] lnet: ensure peer put back on dc request queue
James Simmons
jsimmons at infradead.org
Mon Sep 30 11:56:05 PDT 2019
From: Bruno Faccini <bruno.faccini at intel.com>
When an async PUT request was received from a peer already in the
discovery process, lnet_peer_push_event() did not handle the case
where the peer could be on the working queue (ln_dc_working). The
peer could then be left on the working queue until it finally timed
out, instead of being re-discovered as expected.
Also ensure that the event handler does not put the peer back on the
request queue if discovery has already completed.
WC-bug-id: https://jira.whamcloud.com/browse/LU-10123
Lustre-commit: d0185dd43394 ("LU-10123 lnet: ensure peer put back on dc request queue")
Signed-off-by: Bruno Faccini <bruno.faccini at intel.com>
Reviewed-on: https://review.whamcloud.com/30147
Reviewed-by: Amir Shehata <ashehata at whamcloud.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin at intel.com>
Reviewed-by: Olaf Weber <olaf.weber at hpe.com>
Reviewed-by: Doug Oucharek <dougso at me.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
net/lnet/lnet/peer.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 52d4ec0..e2f8c28 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -1983,13 +1983,16 @@ void lnet_peer_push_event(struct lnet_event *ev)
 out:
 	/*
-	 * Queue the peer for discovery, and wake the discovery thread
-	 * if the peer was already queued, because its status changed.
+	 * Queue the peer for discovery if not done, force it on the request
+	 * queue and wake the discovery thread if the peer was already queued,
+	 * because its status changed.
 	 */
 	spin_unlock(&lp->lp_lock);
 	lnet_net_lock(LNET_LOCK_EX);
-	if (lnet_peer_queue_for_discovery(lp))
+	if (!lnet_peer_is_uptodate(lp) && lnet_peer_queue_for_discovery(lp)) {
+		list_move(&lp->lp_dc_list, &the_lnet.ln_dc_request);
 		wake_up(&the_lnet.ln_dc_waitq);
+	}
 	/* Drop refcount from lookup */
 	lnet_peer_decref_locked(lp);
 	lnet_net_unlock(LNET_LOCK_EX);
@@ -2348,7 +2351,11 @@ static void lnet_discovery_event_handler(struct lnet_event *event)
 		lnet_ping_buffer_decref(pbuf);
 		lnet_peer_decref_locked(lp);
 	}
-	if (rc == LNET_REDISCOVER_PEER) {
+
+	/* Put peer back at end of request queue, if discovery not already
+	 * done
+	 */
+	if (rc == LNET_REDISCOVER_PEER && !lnet_peer_is_uptodate(lp)) {
 		list_move_tail(&lp->lp_dc_list, &the_lnet.ln_dc_request);
 		wake_up(&the_lnet.ln_dc_waitq);
 	}
--
1.8.3.1
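
The queue handling described in the commit message can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions, not the kernel code: struct model_net, struct model_peer,
the list helpers and the scenario functions are hypothetical names
invented for the illustration. It also assumes, based on the comment in
the patch ("if the peer was already queued"), that
lnet_peer_queue_for_discovery() returns non-zero when the peer is
already on a discovery queue. Locking, refcounting and wake-ups are not
modeled; only queue placement is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Move n to the front of head (the role list_move() plays in the patch). */
static void list_move_front(struct list_node *n, struct list_node *head)
{
	list_unlink(n);
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Move n to the back of head (the role list_move_tail() plays). */
static void list_move_back(struct list_node *n, struct list_node *head)
{
	list_unlink(n);
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-ins for the_lnet and struct lnet_peer. */
struct model_net {
	struct list_node dc_request;	/* peers waiting for discovery */
	struct list_node dc_working;	/* peers with discovery in flight */
};

struct model_peer {
	struct list_node dc_list;	/* linkage on one of the queues */
	bool uptodate;			/* stands in for lnet_peer_is_uptodate() */
};

/* ASSUMPTION: returns 0 if newly queued, non-zero if already queued. */
static int model_queue_for_discovery(struct model_net *net,
				     struct model_peer *peer)
{
	if (list_is_empty(&peer->dc_list)) {
		list_move_back(&peer->dc_list, &net->dc_request);
		return 0;
	}
	return 1;
}

/* Mirrors the patched condition at the end of lnet_peer_push_event(). */
static void model_push_event(struct model_net *net, struct model_peer *peer)
{
	if (!peer->uptodate && model_queue_for_discovery(net, peer))
		list_move_front(&peer->dc_list, &net->dc_request);
}

/* A push arrives while the peer sits on the working queue. */
static bool scenario_push_while_working(void)
{
	struct model_net net;
	struct model_peer peer = { .uptodate = false };

	list_init(&net.dc_request);
	list_init(&net.dc_working);
	list_init(&peer.dc_list);
	list_move_back(&peer.dc_list, &net.dc_working);

	model_push_event(&net, &peer);
	return net.dc_request.next == &peer.dc_list &&
	       list_is_empty(&net.dc_working);
}

/* A push arrives for a peer whose discovery is already complete. */
static bool scenario_push_when_uptodate(void)
{
	struct model_net net;
	struct model_peer peer = { .uptodate = true };

	list_init(&net.dc_request);
	list_init(&net.dc_working);
	list_init(&peer.dc_list);
	list_move_back(&peer.dc_list, &net.dc_working);

	model_push_event(&net, &peer);
	return net.dc_working.next == &peer.dc_list &&
	       list_is_empty(&net.dc_request);
}
```

In this model, scenario_push_while_working() shows the case the patch
fixes: the peer is moved from the working queue to the head of the
request queue. scenario_push_when_uptodate() shows the added guard: a
peer that is already up to date is left where it is.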