[lustre-devel] [PATCH 08/18] lnet: use ni fatal error when calculating net health

James Simmons jsimmons at infradead.org
Mon Jul 19 05:32:03 PDT 2021

From: Serguei Smirnov <ssmirnov at whamcloud.com>

When an LND flags an ni with "fatal_error", the ni's health score
remains unaffected. As a result, the net containing that ni can
still be selected for tx, even if it is the only ni in the net.
Take the "fatal_error" status of each ni into account when
calculating the net health score.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14750
Lustre-commit: 86a69f9eb5cab3f9 ("LU-14750 lnet: use ni fatal error when calculating net health")
Signed-off-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43962
Reviewed-by: Chris Horn <chris.horn at hpe.com>
Reviewed-by: Cyril Bordage <cbordage at whamcloud.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/api-ni.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 687df3b..dc9020d 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -3103,11 +3103,12 @@ int lnet_get_net_healthv_locked(struct lnet_net *net)
 	struct lnet_ni *ni;
 	int best_healthv = 0;
-	int healthv;
+	int healthv, ni_fatal;
 	list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
 		healthv = atomic_read(&ni->ni_healthv);
-		if (healthv > best_healthv)
+		ni_fatal = atomic_read(&ni->ni_fatal_error_on);
+		if (!ni_fatal && healthv > best_healthv)
 			best_healthv = healthv;
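
For readers without the kernel tree at hand, the selection rule in the
hunk above can be sketched as a standalone userspace function. This is
only an illustration under stated assumptions: struct sketch_ni and
sketch_net_healthv() are hypothetical names, a plain array stands in
for net->net_ni_list, ordinary fields stand in for the atomic_read()
calls, and the final return of best_healthv is assumed because the
hunk does not show the end of the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lnet_ni: only the two values the patch reads. */
struct sketch_ni {
	int healthv;		/* models atomic_read(&ni->ni_healthv) */
	bool fatal_error;	/* models atomic_read(&ni->ni_fatal_error_on) */
};

/*
 * Return the best health value among the NIs that are not flagged with a
 * fatal error. Returns 0 when the array is empty or every NI is flagged.
 */
static int sketch_net_healthv(const struct sketch_ni *nis, size_t count)
{
	int best_healthv = 0;
	size_t i;

	assert(nis != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* Same condition as the patched loop: skip fatal NIs. */
		if (!nis[i].fatal_error && nis[i].healthv > best_healthv)
			best_healthv = nis[i].healthv;
	}

	return best_healthv;
}
```

With arbitrary example values, a net whose only ni is { 1000, true }
yields 0 in this sketch, whereas the pre-patch comparison would have
yielded 1000. A net holding { 1000, true } and { 600, false } yields
600, so a healthy ni still counts for its net while the flagged one is
ignored.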
