[lustre-devel] [PATCH 36/50] lnet: Avoid peer NI recovery for local interface

James Simmons jsimmons at infradead.org
Sun Mar 20 06:30:50 PDT 2022


From: Chris Horn <chris.horn at hpe.com>

If an MR peer has an MR peer entry for itself (this can happen if the
entry is created manually or discovery is run against the node itself),
it can put its own interfaces into peer NI recovery. Problems with
local interfaces should be handled via local NI recovery instead.

HPE-bug-id: LUS-10661
WC-bug-id: https://jira.whamcloud.com/browse/LU-15398
Lustre-commit: fb5d7036ec356c825 ("LU-15398 lnet: Avoid peer NI recovery for local interface")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/45933
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh at hpe.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/lib-msg.c | 6 ++++++
 1 file changed, 6 insertions(+)
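
Illustration only, not part of the patch: a minimal user-space sketch of
the decision the hunk below adds. It assumes NIDs can be modelled as
plain strings and the set of local NIs as a small array; the kernel code
instead passes a pointer to lpni_nid to lnet_nid_to_ni_locked(). The
names local_ni_table, nid_is_local() and should_handle_remote_health()
are hypothetical and do not exist in LNet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the set of local NIs; not an LNet type. */
struct local_ni_table {
	const char **nids;
	size_t count;
};

/* Toy analogue of lnet_nid_to_ni_locked(): does this NID belong to us? */
static bool nid_is_local(const struct local_ni_table *tbl, const char *nid)
{
	size_t i;

	assert(tbl);
	for (i = 0; i < tbl->count; i++)
		if (strcmp(tbl->nids[i], nid) == 0)
			return true;
	return false;
}

/*
 * Mirrors the added check: if remote health handling is still enabled
 * and the peer NI's NID matches a local NI, turn it off so the
 * interface is left to local NI recovery.
 */
static bool should_handle_remote_health(const struct local_ni_table *tbl,
					bool handle_remote_health,
					const char *peer_nid)
{
	if (handle_remote_health && peer_nid && nid_is_local(tbl, peer_nid))
		return false;
	return handle_remote_health;
}
```

With a table holding "10.0.0.1@tcp", the sketch returns false for that
NID and leaves the flag unchanged for any NID not in the table, which is
the behaviour the commit message describes for a peer entry that points
back at the local node.
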

diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index 88f017b..f476975 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -877,6 +877,12 @@
 			if (!lnet_isrouter(lpni))
 				handle_remote_health = false;
 		}
+		/* Do not put my interfaces into peer NI recovery. They should
+		 * be handled with local NI recovery.
+		 */
+		if (handle_remote_health && lpni &&
+		    lnet_nid_to_ni_locked(&lpni->lpni_nid, 0))
+			handle_remote_health = false;
 		lnet_net_unlock(0);
 	}
 
-- 
1.8.3.1


