[lustre-devel] [PATCH 09/19] lnet: Reset ni_ping_count only on receive

James Simmons jsimmons at infradead.org
Sun Nov 28 15:27:44 PST 2021


From: Chris Horn <chris.horn at hpe.com>

The ni_ping_count field of struct lnet_ni is currently reset on every
healthy tx, which means taking net_lock 0 on every tx and results in
a performance loss for certain workloads. Reset ni_ping_count only
when a message is received over the NI.
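
As a rough illustration of the resulting behavior, here is a minimal
user-space sketch. It is not the kernel code: every sketch_* name is a
hypothetical stand-in, the lock is modeled as a plain counter, and only
the two ping-count fields touched by this patch are represented. It
shows the lock being taken, and both counts being reset, only for a
message received from a known peer NI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the LNet structures; only the fields
 * touched by this patch are modeled.
 */
struct sketch_ni {
	unsigned int ni_ping_count;
};

struct sketch_lpni {
	unsigned int lpni_ping_count;
};

/* Models net_lock 0: counts acquisitions and tracks whether it is held. */
static unsigned long sketch_lock_acquisitions;
static bool sketch_lock_held;

static void sketch_net_lock(void)
{
	assert(!sketch_lock_held);
	sketch_lock_held = true;
	sketch_lock_acquisitions++;
}

static void sketch_net_unlock(void)
{
	assert(sketch_lock_held);
	sketch_lock_held = false;
}

/* Models the healthy-message path after this patch: the lock is taken,
 * and both ping counts are reset, only when the message was received
 * from a known peer NI. A pure tx never touches the lock.
 */
static void sketch_handle_healthy_msg(struct sketch_ni *ni,
				      struct sketch_lpni *lpni,
				      bool rx_committed)
{
	if (lpni && rx_committed) {
		sketch_net_lock();
		lpni->lpni_ping_count = 0;
		ni->ni_ping_count = 0;
		sketch_net_unlock();
	}
}
```

In this sketch a tx-only workload leaves sketch_lock_acquisitions
unchanged, which is the contention the patch aims to avoid.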

Fixes: 885dab4e09 ("lnet: Recover local NI w/exponential backoff interval")
HPE-bug-id: LUS-10427
WC-bug-id: https://jira.whamcloud.com/browse/LU-15102
Lustre-commit: 9cc0a5ff5fc8f45aa ("LU-15102 lnet: Reset ni_ping_count only on receive")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/45235
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh at hpe.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 net/lnet/lnet/lib-msg.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index 3c8b7c3..12768b2 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -888,8 +888,6 @@
 		 * faster recovery.
 		 */
 		lnet_inc_healthv(&ni->ni_healthv, lnet_health_sensitivity);
-		lnet_net_lock(0);
-		ni->ni_ping_count = 0;
 		/* It's possible msg_txpeer is NULL in the LOLND
 		 * case. Only increment the peer's health if we're
 		 * receiving a message from it. It's the only sure way to
@@ -898,7 +896,9 @@
 		 * as indication that the router is fully healthy.
 		 */
 		if (lpni && msg->msg_rx_committed) {
+			lnet_net_lock(0);
 			lpni->lpni_ping_count = 0;
+			ni->ni_ping_count = 0;
 			/* If we're receiving a message from the router or
 			 * I'm a router, then set that lpni's health to
 			 * maximum so we can commence communication
@@ -925,8 +925,8 @@
 								     &the_lnet.ln_mt_peerNIRecovq,
 								     ktime_get_seconds());
 			}
+			lnet_net_unlock(0);
 		}
-		lnet_net_unlock(0);
 
 		/* we can finalize this message */
 		return -1;
-- 
1.8.3.1


