[lustre-devel] [PATCH 194/622] lustre: ptlrpc: connect vs import invalidate race

James Simmons jsimmons at infradead.org
Thu Feb 27 13:11:02 PST 2020


From: Andriy Skulysh <c17819 at cray.com>

Connect can't be sent while import invalidate is
in progress, thus it leaves the import in not
initialized state.

Don't allow reconnect in evicted state.

Cray-bug-id: LUS-6322
WC-bug-id: https://jira.whamcloud.com/browse/LU-7558
Lustre-commit: b1827ff1da82 ("LU-7558 ptlrpc: connect vs import invalidate race")
Signed-off-by: Andriy Skulysh <c17819 at cray.com>
Reviewed-by: Alexander Boyko <c17825 at cray.com>
Reviewed-by: Andrew Perepechko <c17827 at cray.com>
Reviewed-on: https://review.whamcloud.com/33718
Reviewed-by: Mike Pershin <mpershin at whamcloud.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
 fs/lustre/include/obd_support.h | 1 +
 fs/lustre/ptlrpc/import.c       | 6 ++++++
 fs/lustre/ptlrpc/recover.c      | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index c2db38f..5ff270a 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -353,6 +353,7 @@
 #define OBD_FAIL_PTLRPC_LONG_REQ_UNLINK			0x51b
 #define OBD_FAIL_PTLRPC_LONG_BOTH_UNLINK		0x51c
 #define OBD_FAIL_PTLRPC_BULK_ATTACH      0x521
+#define OBD_FAIL_PTLRPC_CONNECT_RACE			0x531
 
 #define OBD_FAIL_OBD_PING_NET				0x600
 /*	OBD_FAIL_OBD_LOG_CANCEL_NET	0x601 obsolete since 1.5 */
diff --git a/fs/lustre/ptlrpc/import.c b/fs/lustre/ptlrpc/import.c
index 867aff6..df6c459 100644
--- a/fs/lustre/ptlrpc/import.c
+++ b/fs/lustre/ptlrpc/import.c
@@ -38,6 +38,7 @@
 #define DEBUG_SUBSYSTEM S_RPC
 
 #include <linux/kthread.h>
+#include <linux/delay.h>
 #include <linux/fs_struct.h>
 #include <obd_support.h>
 #include <lustre_ha.h>
@@ -273,6 +274,10 @@ void ptlrpc_invalidate_import(struct obd_import *imp)
 	if (!imp->imp_invalid || imp->imp_obd->obd_no_recov)
 		ptlrpc_deactivate_import(imp);
 
+	if (OBD_FAIL_PRECHECK(OBD_FAIL_PTLRPC_CONNECT_RACE)) {
+		OBD_RACE(OBD_FAIL_PTLRPC_CONNECT_RACE);
+		msleep(10 * MSEC_PER_SEC);
+	}
 	CFS_FAIL_TIMEOUT(OBD_FAIL_MGS_CONNECT_NET, 3 * cfs_fail_val / 2);
 	LASSERT(imp->imp_invalid);
 
@@ -615,6 +620,7 @@ int ptlrpc_connect_import(struct obd_import *imp)
 		CERROR("already connected\n");
 		return 0;
 	} else if (imp->imp_state == LUSTRE_IMP_CONNECTING ||
+		   imp->imp_state == LUSTRE_IMP_EVICTED ||
 		   imp->imp_connected) {
 		spin_unlock(&imp->imp_lock);
 		CERROR("already connecting\n");
diff --git a/fs/lustre/ptlrpc/recover.c b/fs/lustre/ptlrpc/recover.c
index 7c09c4e..ceab288 100644
--- a/fs/lustre/ptlrpc/recover.c
+++ b/fs/lustre/ptlrpc/recover.c
@@ -339,6 +339,8 @@ int ptlrpc_recover_import(struct obd_import *imp, char *new_uuid, int async)
 	if (rc)
 		goto out;
 
+	OBD_RACE(OBD_FAIL_PTLRPC_CONNECT_RACE);
+
 	rc = ptlrpc_connect_import(imp);
 	if (rc)
 		goto out;
-- 
1.8.3.1



More information about the lustre-devel mailing list