[lustre-devel] [PATCH 48/50] lnet: Stop discovery on deleted peer NI

James Simmons jsimmons at infradead.org
Sun Mar 20 06:31:02 PDT 2022


From: Chris Horn <chris.horn at hpe.com>

lnet_discover_peer_locked() needs to check whether the peer NI that is
undergoing discovery has been deleted (i.e. its associated peer has
the LNET_PEER_MARK_DELETED state). Otherwise, we may enter an infinite
loop because this peer will never be considered up to date.

Fixes: 4f69acf8aa ("lnet: Prevent discovery on deleted peer")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15512
Lustre-commit: 94f4e1f517d71ffd6 ("LU-15512 lnet: Stop discovery on deleted peer NI")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/46429
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: James Simmons <jsimmons at infradead.org>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/peer.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
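
The loop described in the commit message can be modeled with a small
standalone sketch. This is not the LNet code: struct toy_peer, the TOY_*
flags, toy_discovery_pass() and toy_discover() are hypothetical stand-ins,
and the model simply assumes that a peer marked deleted is never brought
up to date. The max_attempts cap exists only so the unpatched behaviour
can be observed without actually spinning forever.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real flags are defined by LNet. */
#define TOY_PEER_UPTODATE      (1u << 0)
#define TOY_PEER_MARK_DELETED  (1u << 1)

struct toy_peer {
	unsigned int state;
};

/* Model assumption: a deleted peer is never refreshed by discovery. */
static void toy_discovery_pass(struct toy_peer *lp)
{
	if (!(lp->state & TOY_PEER_MARK_DELETED))
		lp->state |= TOY_PEER_UPTODATE;
}

/*
 * Returns the number of discovery attempts made, or -1 if the loop
 * reached max_attempts without terminating (a stand-in for "loops
 * forever"). check_deleted selects the patched behaviour.
 */
static int toy_discover(struct toy_peer *lp, bool check_deleted,
			int max_attempts)
{
	int count = 0;

	for (;;) {
		if (lp->state & TOY_PEER_UPTODATE)
			break;
		/* The extra exit condition this patch introduces. */
		if (check_deleted && (lp->state & TOY_PEER_MARK_DELETED))
			break;
		if (count >= max_attempts)
			return -1;
		toy_discovery_pass(lp);
		count++;
	}
	return count;
}
```

In this model, a live peer finishes after one attempt with or without
the check, while a deleted peer terminates only when check_deleted is
true.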

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 16a694c..98f71dd 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2578,6 +2578,8 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 			break;
 		if (lnet_peer_is_uptodate(lp))
 			break;
+		if (lp->lp_state & LNET_PEER_MARK_DELETED)
+			break;
 		lnet_peer_queue_for_discovery(lp);
 		count++;
 		CDEBUG(D_NET, "Discovery attempt # %d\n", count);
@@ -2620,7 +2622,9 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		rc = lp->lp_dc_error;
 	else if (!block)
 		CDEBUG(D_NET, "non-blocking discovery\n");
-	else if (!lnet_peer_is_uptodate(lp) && !lnet_is_discovery_disabled(lp))
+	else if (!lnet_peer_is_uptodate(lp) &&
+		 !(lnet_is_discovery_disabled(lp) ||
+		   (lp->lp_state & LNET_PEER_MARK_DELETED)))
 		goto again;
 
 	CDEBUG(D_NET, "peer %s NID %s: %d. %s\n",
-- 
1.8.3.1