[lustre-devel] [PATCH 09/19] lnet: Reset ni_ping_count only on receive
James Simmons
jsimmons at infradead.org
Sun Nov 28 15:27:44 PST 2021
From: Chris Horn <chris.horn at hpe.com>
The lnet_ni:ni_ping_count is currently reset on every (healthy) tx.
We should only reset it when receiving a message over the NI. Taking
net_lock 0 on every tx results in a performance loss for certain
workloads.
Fixes: 885dab4e09 ("lnet: Recover local NI w/exponential backoff interval")
HPE-bug-id: LUS-10427
WC-bug-id: https://jira.whamcloud.com/browse/LU-15102
Lustre-commit: 9cc0a5ff5fc8f45aa ("LU-15102 lnet: Reset ni_ping_count only on receive")
Signed-off-by: Chris Horn <chris.horn at hpe.com>
Reviewed-on: https://review.whamcloud.com/45235
Reviewed-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh at hpe.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
net/lnet/lnet/lib-msg.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index 3c8b7c3..12768b2 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -888,8 +888,6 @@
* faster recovery.
*/
lnet_inc_healthv(&ni->ni_healthv, lnet_health_sensitivity);
- lnet_net_lock(0);
- ni->ni_ping_count = 0;
/* It's possible msg_txpeer is NULL in the LOLND
* case. Only increment the peer's health if we're
* receiving a message from it. It's the only sure way to
@@ -898,7 +896,9 @@
* as indication that the router is fully healthy.
*/
if (lpni && msg->msg_rx_committed) {
+ lnet_net_lock(0);
lpni->lpni_ping_count = 0;
+ ni->ni_ping_count = 0;
/* If we're receiving a message from the router or
* I'm a router, then set that lpni's health to
* maximum so we can commence communication
@@ -925,8 +925,8 @@
&the_lnet.ln_mt_peerNIRecovq,
ktime_get_seconds());
}
+ lnet_net_unlock(0);
}
- lnet_net_unlock(0);
/* we can finalize this message */
return -1;
--
1.8.3.1
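
The locking change described in the commit message can be sketched outside the kernel roughly as follows. This is only an illustrative userspace model, not the LNet code: the sketch_* names, the pthread mutex standing in for net_lock 0, and the acquisition counter are all assumptions made for the example. It shows the post-patch behaviour, where the lock is taken and both ping counts are reset only when a message was actually received from the peer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for lnet_ni / lnet_peer_ni; only the
 * fields touched by the patch are modelled.
 */
struct sketch_ni   { unsigned int ni_ping_count; };
struct sketch_lpni { unsigned int lpni_ping_count; };

/* A plain mutex standing in for net_lock 0 (assumption). */
static pthread_mutex_t sketch_net_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts lock acquisitions so the cost of each path is visible. */
static unsigned long sketch_lock_acquisitions;

static void sketch_net_lock_take(void)
{
	pthread_mutex_lock(&sketch_net_lock);
	sketch_lock_acquisitions++;
}

static void sketch_net_lock_release(void)
{
	pthread_mutex_unlock(&sketch_net_lock);
}

/* Models the healthy-completion path after the patch: a tx-only
 * completion never touches the lock or the ping counts; an rx
 * completion resets both counts under the lock.
 */
static void sketch_health_ok(struct sketch_ni *ni,
			     struct sketch_lpni *lpni,
			     bool rx_committed)
{
	assert(ni != NULL);

	/* The NI health increment from the real code is omitted here. */

	if (lpni && rx_committed) {
		sketch_net_lock_take();
		lpni->lpni_ping_count = 0;
		ni->ni_ping_count = 0;
		sketch_net_lock_release();
	}
}
```

Calling sketch_health_ok(&ni, &lpni, false) models a healthy tx: it leaves both counts and the acquisition counter unchanged. Passing true for rx_committed resets both counts and takes the lock exactly once.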