[lustre-devel] [PATCH 38/42] lnet: deadlock on LNet shutdown
James Simmons
jsimmons at infradead.org
Mon Oct 5 17:06:17 PDT 2020
From: Serguei Smirnov <ssmirnov at whamcloud.com>
Release ln_api_mutex during LNet shutdown while waiting
for a zombie LNI. This allows other threads to read the LNet
state updated by the shutdown and fall through, avoiding
the deadlock.
WC-bug-id: https://jira.whamcloud.com/browse/LU-12233
Lustre-commit: e0c445648a38fb ("LU-12233 lnet: deadlock on LNet shutdown")
Signed-off-by: Serguei Smirnov <ssmirnov at whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39933
Reviewed-by: Chris Horn <chris.horn at hpe.com>
Reviewed-by: Amir Shehata <ashehata at whamcloud.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
net/lnet/lnet/api-ni.c | 8 ++++++++
1 file changed, 8 insertions(+)
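
The unlock, sleep, relock pattern this patch applies can be sketched in
userspace with POSIX threads. This is only an illustration under stated
assumptions, not the LNet code: api_mutex stands in for
the_lnet.ln_api_mutex, and shutting_down, zombie_refs, ref_holder() and
run_shutdown_demo() are hypothetical names. It also assumes the other
thread must take the mutex to notice the shutdown before it can drop the
reference that keeps the zombie alive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the_lnet.ln_api_mutex and the LNet state. */
static pthread_mutex_t api_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool shutting_down;	/* protected by api_mutex */
static int zombie_refs;		/* protected by api_mutex */

static void short_sleep(long nsec)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = nsec };

	nanosleep(&ts, NULL);
}

/* Holds one reference; needs api_mutex to read the shutdown state. */
static void *ref_holder(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&api_mutex);
		if (shutting_down) {
			/* State observed: fall through and drop the ref. */
			zombie_refs--;
			pthread_mutex_unlock(&api_mutex);
			return NULL;
		}
		pthread_mutex_unlock(&api_mutex);
		short_sleep(1000000L);		/* 1 ms */
	}
}

/* Returns 0 once the zombie reference has been released, -1 on error. */
static int run_shutdown_demo(void)
{
	pthread_t tid;

	pthread_mutex_lock(&api_mutex);
	shutting_down = false;
	zombie_refs = 1;
	pthread_mutex_unlock(&api_mutex);

	if (pthread_create(&tid, NULL, ref_holder, NULL) != 0)
		return -1;

	pthread_mutex_lock(&api_mutex);
	shutting_down = true;
	while (zombie_refs > 0) {
		/*
		 * Drop the mutex while sleeping. If it stayed held here,
		 * ref_holder() could never take it to read shutting_down,
		 * and this loop would never terminate.
		 */
		pthread_mutex_unlock(&api_mutex);
		short_sleep(10000000L);		/* 10 ms */
		pthread_mutex_lock(&api_mutex);
	}
	pthread_mutex_unlock(&api_mutex);

	if (pthread_join(tid, NULL) != 0)
		return -1;
	return 0;
}
```

Calling run_shutdown_demo() from a main() built with -pthread should
return 0. Moving the sleep so that it runs with api_mutex still held
reproduces the hang in this sketch.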
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index f678ae2..03473bf 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -2036,13 +2036,21 @@ static void lnet_push_target_fini(void)
 		}
 
 		if (!list_empty(&ni->ni_netlist)) {
+			/* Unlock mutex while waiting to allow other
+			 * threads to read the LNet state and fall through
+			 * to avoid deadlock
+			 */
 			lnet_net_unlock(LNET_LOCK_EX);
+			mutex_unlock(&the_lnet.ln_api_mutex);
+
 			++i;
 			if ((i & (-i)) == i) {
 				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
 				       libcfs_nid2str(ni->ni_nid));
 			}
 			schedule_timeout_uninterruptible(HZ);
+
+			mutex_lock(&the_lnet.ln_api_mutex);
 			lnet_net_lock(LNET_LOCK_EX);
 			continue;
 		}
--
1.8.3.1