[lustre-devel] [PATCH 44/49] lnet: Only recover known good peer NIs

James Simmons jsimmons at infradead.org
Wed Apr 14 21:02:36 PDT 2021


From: Chris Horn <chris.horn at hpe.com>

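A peer NI should not be eligible for recovery if we have never
received a message from it. Treat a zero lpni_last_alive as "never
seen": log the peer NI at D_NET and return before the aging check.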

HPE-bug-id: LUS-9109
WC-bug-id: https://jira.whamcloud.com/browse/LU-13569
Lustre-commit: 39a169cd02738a1 ("LU-13569 lnet: Only recover known good peer NIs")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/39719
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: James Simmons <jsimmons at infradead.org>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/peer.c | 8 ++++++++
 1 file changed, 8 insertions(+)
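
Note for reviewers: the order of checks after this change can be
sketched in isolation as below. This is only an illustrative model, not
the LNet code. The struct, the function name and SKETCH_MAX_HEALTH are
hypothetical stand-ins for lpni, the patched function and
LNET_MAX_HEALTH_VALUE, and the sketch assumes the "aged out" branch also
returns, which the hunk context does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for LNET_MAX_HEALTH_VALUE. */
#define SKETCH_MAX_HEALTH 1000

/* Hypothetical, reduced model of a peer NI. */
struct sketch_peer_ni {
	int	healthv;	/* models lpni_healthv */
	int64_t	last_alive;	/* models lpni_last_alive; 0 = never seen */
};

/* Returns true if the peer NI may be considered for recovery. */
static bool sketch_recovery_eligible(const struct sketch_peer_ni *p,
				     int64_t now, int64_t recovery_limit)
{
	/* Fully healthy: nothing to recover. */
	if (p->healthv == SKETCH_MAX_HEALTH)
		return false;

	/* New check: no message was ever received from this peer NI. */
	if (!p->last_alive)
		return false;

	/* Assumed: an aged-out peer NI is skipped as well. */
	if (now > p->last_alive + recovery_limit)
		return false;

	return true;
}
```

With now = 100 and recovery_limit = 50, a degraded peer NI last seen at
90 stays eligible, while one with last_alive == 0 is rejected by the new
check before the aging comparison is reached.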

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index fe80b81..f9af5da 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -3994,6 +3994,14 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 	if (atomic_read(&lpni->lpni_healthv) == LNET_MAX_HEALTH_VALUE)
 		return;
 
+	if (!lpni->lpni_last_alive) {
+		CDEBUG(D_NET,
+		       "lpni %s(%p) not eligible for recovery last alive %lld\n",
+		       libcfs_nid2str(lpni->lpni_nid), lpni,
+		       lpni->lpni_last_alive);
+		return;
+	}
+
 	if (now > lpni->lpni_last_alive + lnet_recovery_limit) {
 		CDEBUG(D_NET, "lpni %s aged out last alive %lld\n",
 		       libcfs_nid2str(lpni->lpni_nid),
-- 
1.8.3.1


