[lustre-devel] [PATCH 05/15] lnet: Ensure ref taken when queueing for discovery

James Simmons jsimmons at infradead.org
Wed Jul 7 12:11:06 PDT 2021


From: Chris Horn <chris.horn at hpe.com>

Call lnet_peer_queue_for_discovery() in
lnet_discovery_event_handler() to ensure that we take a ref on
the peer when forcing it onto the discovery queue. This also ensures
that the peer state has LNET_PEER_DISCOVERING.

Add a test to sanity-lnet.sh that can trigger the refcount loss bug
in discovery.

HPE-bug-id: LUS-7651
WC-bug-id: https://jira.whamcloud.com/browse/LU-14627
Lustre-commit: 2ce6957b69370b0c ("LU-14627 lnet: Ensure ref taken when queueing for discovery")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/43418
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko at hpe.com>
Reviewed-by: James Simmons <jsimmons at infradead.org>
Reviewed-by: Stephane Thiell <sthiell at stanford.edu>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/peer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 76b2d2f..29c3372 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2783,7 +2783,8 @@ static void lnet_discovery_event_handler(struct lnet_event *event)
 	/* Put peer back at end of request queue, if discovery not already
 	 * done
 	 */
-	if (rc == LNET_REDISCOVER_PEER && !lnet_peer_is_uptodate(lp)) {
+	if (rc == LNET_REDISCOVER_PEER && !lnet_peer_is_uptodate(lp) &&
+	    lnet_peer_queue_for_discovery(lp)) {
 		list_move_tail(&lp->lp_dc_list, &the_lnet.ln_dc_request);
 		wake_up(&the_lnet.ln_dc_waitq);
 	}
-- 
1.8.3.1



More information about the lustre-devel mailing list