[lustre-devel] [PATCH 06/15] lnet: Correct distance calculation of local NIDs

James Simmons jsimmons at infradead.org
Wed Jul 7 12:11:07 PDT 2021


From: Chris Horn <chris.horn at hpe.com>

Multi-rail peers can have multiple local NIDs on the same net, but
LNetDist() stops at the first NI on the destination net returned by
lnet_get_next_ni_locked(). A target NID is therefore identified as
local only if its NI is visited before any other NI on that net.

Check all local NIs for a match with the target NID in LNetDist()
before falling back to the net-level match.

Add a test to check the LNetDist() calculation of local NIDs for a
peer with multiple NIDs on the same net.

HPE-bug-id: LUS-9964
WC-bug-id: https://jira.whamcloud.com/browse/LU-14649
Lustre-commit: 4d0162037415988b ("LU-14649 lnet: Correct distance calculation of local NIDs")
Signed-off-by: Chris Horn <hornc at cray.com>
Reviewed-on: https://review.whamcloud.com/43498
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko at hpe.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/lib-move.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 3ae0209..33d7e78 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -4981,6 +4981,7 @@ struct lnet_msg *
 	int cpt;
 	u32 order = 2;
 	struct list_head *rn_list;
+	bool matched_dstnet = false;
 
 	/*
 	 * if !local_nid_dist_zero, I don't return a distance of 0 ever
@@ -5007,27 +5008,40 @@ struct lnet_msg *
 			return local_nid_dist_zero ? 0 : 1;
 		}
 
-		if (LNET_NIDNET(ni->ni_nid) == dstnet) {
-			/*
-			 * Check if ni was originally created in
-			 * current net namespace.
-			 * If not, assign order above 0xffff0000,
-			 * to make this ni not a priority.
+		if (!matched_dstnet && LNET_NIDNET(ni->ni_nid) == dstnet) {
+			matched_dstnet = true;
+			/* We matched the destination net, but we may have
+			 * additional local NIs to inspect.
+			 *
+			 * We record the nid and order as appropriate, but
+			 * they may be overwritten if we match local NI above.
 			 */
-			if (current->nsproxy &&
-			    !net_eq(ni->ni_net_ns, current->nsproxy->net_ns))
-				order += 0xffff0000;
 			if (srcnidp)
 				*srcnidp = ni->ni_nid;
-			if (orderp)
-				*orderp = order;
-			lnet_net_unlock(cpt);
-			return 1;
+
+			if (orderp) {
+				/* Check if ni was originally created in
+				 * current net namespace.
+				 * If not, assign order above 0xffff0000,
+				 * to make this ni not a priority.
+				 */
+				if (current->nsproxy &&
+				    !net_eq(ni->ni_net_ns,
+					    current->nsproxy->net_ns))
+					*orderp = order + 0xffff0000;
+				else
+					*orderp = order;
+			}
 		}
 
 		order++;
 	}
 
+	if (matched_dstnet) {
+		lnet_net_unlock(cpt);
+		return 1;
+	}
+
 	rn_list = lnet_net2rnethash(dstnet);
 	list_for_each_entry(rnet, rn_list, lrn_list) {
 		if (rnet->lrn_net == dstnet) {
-- 
1.8.3.1
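
For readers who want to see the behavioral change in isolation, here is
a minimal userspace sketch of the two loop shapes. It is not the kernel
code: nid_t, MK_NID(), NID_NET(), dist_first_match() and dist_scan_all()
are hypothetical names made up for this illustration, the NI list is a
plain array, locking and the order/namespace handling are left out, and
local_nid_dist_zero is assumed to be set so that an exact local match
yields distance 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified NID: net in the upper 32 bits, address below. */
typedef uint64_t nid_t;
#define MK_NID(net, addr) (((nid_t)(net) << 32) | (uint32_t)(addr))
#define NID_NET(nid)      ((uint32_t)((nid) >> 32))

/*
 * Old loop shape: return as soon as any NI on the destination net is
 * seen, so an exact match later in the list is never reached.
 */
static int dist_first_match(const nid_t *nis, size_t n, nid_t dst,
			    nid_t *srcnidp)
{
	for (size_t i = 0; i < n; i++) {
		if (nis[i] == dst) {
			if (srcnidp)
				*srcnidp = dst;
			return 0;	/* assumes local_nid_dist_zero */
		}
		if (NID_NET(nis[i]) == NID_NET(dst)) {
			if (srcnidp)
				*srcnidp = nis[i];
			return 1;
		}
	}
	return -1;	/* stand-in for the remote-net lookup */
}

/*
 * New loop shape: remember the first NI on the destination net, but
 * keep scanning so an exact local match can still override it.
 */
static int dist_scan_all(const nid_t *nis, size_t n, nid_t dst,
			 nid_t *srcnidp)
{
	bool matched_dstnet = false;

	for (size_t i = 0; i < n; i++) {
		if (nis[i] == dst) {
			if (srcnidp)
				*srcnidp = dst;
			return 0;	/* assumes local_nid_dist_zero */
		}
		if (!matched_dstnet && NID_NET(nis[i]) == NID_NET(dst)) {
			matched_dstnet = true;
			if (srcnidp)
				*srcnidp = nis[i];
		}
	}
	return matched_dstnet ? 1 : -1;
}
```

With two NIs on net 1 (addresses 10 and 11) and a target of address 11,
dist_first_match() reports distance 1 with the first NI as the source,
while dist_scan_all() reports distance 0 with the target itself as the
source.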


