[lustre-devel] [PATCH 10/14] lnet: Skip discovery in LNetPrimaryNID if DD disabled

James Simmons jsimmons at infradead.org
Mon May 3 17:10:12 PDT 2021


From: Chris Horn <chris.horn at hpe.com>

If discovery is disabled locally then the discovery thread will not
modify any peer objects as a result of the discovery process. Thus,
the primary NID of any peer we're asked to discover will not change
as a result of discovery, and we do not need to actually perform
discovery in LNetPrimaryNID() when discovery is disabled locally.
Since this routine can result in long client mount times when a
Lustre server is down, we should avoid this unnecessary discovery.

HPE-bug-id: LUS-9887
WC-bug-id: https://jira.whamcloud.com/browse/LU-14566
Lustre-commit: 16264da9e3c43a63 ("LU-14566 lnet: Skip discovery in LNetPrimaryNID if DD disabled")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/43141
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: James Simmons <jsimmons at infradead.org>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
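Note (not part of the commit): a minimal standalone sketch of the loop's
control flow after this change, to illustrate why a down server no longer
costs any discovery rounds when DD is disabled locally. All model_* names,
the reachable flag, and the max_rounds bound are hypothetical stand-ins and
not LNet symbols; locking, refcounting, and the real discovery queue are
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lnet_peer; not the real layout. */
struct model_peer {
	bool uptodate;         /* models lnet_peer_is_uptodate(lp) */
	bool peer_dd_disabled; /* models lnet_is_discovery_disabled(lp) */
	bool reachable;        /* assumption: can a discovery round succeed? */
	int discover_calls;    /* number of (potentially slow) rounds run */
};

/* Models the local lnet_peer_discovery_disabled setting. */
static bool model_local_dd_disabled;

/* One discovery round: only a reachable peer becomes up to date. */
static void model_discover(struct model_peer *lp)
{
	lp->discover_calls++;
	if (lp->reachable)
		lp->uptodate = true;
}

/* Control flow of the patched loop; returns the rounds performed. */
static int model_primary_nid_loop(struct model_peer *lp, int max_rounds)
{
	assert(lp != NULL);
	lp->discover_calls = 0;

	/* New check: skip discovery entirely if DD is disabled locally. */
	while (!model_local_dd_disabled && !lp->uptodate) {
		model_discover(lp);

		/* Peer-side DD disabled: primary NID will not change. */
		if (lp->peer_dd_disabled)
			break;

		/* Model-only bound so the sketch always terminates. */
		if (lp->discover_calls >= max_rounds)
			break;
	}
	return lp->discover_calls;
}
```

With model_local_dd_disabled set, an unreachable peer costs zero rounds; with
it clear, the same peer consumes rounds until the model's bound is hit.
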
 net/lnet/lnet/peer.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index db00514..d66a302 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -1336,7 +1336,13 @@ struct lnet_peer_ni *
 	}
 	lp = lpni->lpni_peer_net->lpn_peer;
 
-	while (!lnet_peer_is_uptodate(lp)) {
+	/* If discovery is disabled locally then we needn't bother running
+	 * discovery here because discovery will not modify whatever
+	 * primary NID is currently set for this peer. If the specified peer is
+	 * down then this discovery can introduce long delays into the mount
+	 * process, so skip it if it isn't necessary.
+	 */
+	while (!lnet_peer_discovery_disabled && !lnet_peer_is_uptodate(lp)) {
 		spin_lock(&lp->lp_lock);
 		/* force a full discovery cycle */
 		lp->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH;
@@ -1357,7 +1363,11 @@ struct lnet_peer_ni *
 		}
 		lp = lpni->lpni_peer_net->lpn_peer;
 
-		/* Only try once if discovery is disabled */
+		/* If we find that the peer has discovery disabled then we will
+		 * not modify whatever primary NID is currently set for this
+		 * peer. Thus, we can break out of this loop even if the peer
+		 * is not fully up to date.
+		 */
 		if (lnet_is_discovery_disabled(lp))
 			break;
 	}
-- 
1.8.3.1


