[lustre-devel] [PATCH 05/19] lnet: libcfs: add timeout to cfs_race() to fix race
James Simmons
jsimmons at infradead.org
Sun Nov 28 15:27:40 PST 2021
From: Alex Zhuravlev <bzzz at whamcloud.com>
There is no guarantee that the branches in cfs_race() are executed
in strict order, so the second branch (with cfs_race_state=1) can be
executed before the first branch. A thread that then executes the
first branch resets cfs_race_state to 0, has already missed the
wakeup, and gets stuck.
This construction is used for testing only, so as a workaround it is
enough to time out.
WC-bug-id: https://jira.whamcloud.com/browse/LU-13358
Lustre-commit: 2d2d381f35ee00431 ("LU-13358 libcfs: add timeout to cfs_race() to fix race")
Signed-off-by: Alex Zhuravlev <bzzz at whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43161
Reviewed-by: James Simmons <jsimmons at infradead.org>
Reviewed-by: Neil Brown <neilb at suse.de>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
---
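Note for readers, not part of the commit: the ordering problem described
in the commit message can be illustrated in user space. The sketch below
is only an analogy. It assumes POSIX threads in place of the kernel wait
queue, and race_wait_timeout(), race_wakeup() and delayed_wakeup() are
hypothetical names, not libcfs functions. It shows why a bounded wait
lets the "first branch" recover when the wakeup has already happened.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* User-space stand-ins for cfs_race_waitq and cfs_race_state (assumption). */
static pthread_mutex_t race_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t race_waitq = PTHREAD_COND_INITIALIZER;
static int race_state;

/*
 * "First branch": reset the state, then wait for a wakeup, but only
 * until the deadline. Returns 1 if woken, 0 if the wait timed out.
 */
static int race_wait_timeout(long timeout_ms)
{
	struct timespec deadline;
	int rc = 0;

	assert(timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&race_lock);
	race_state = 0;	/* any earlier wakeup is lost at this point */
	while (race_state == 0 && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&race_waitq, &race_lock, &deadline);
	rc = (race_state != 0);
	pthread_mutex_unlock(&race_lock);
	return rc;
}

/* "Second branch": set the state and wake any waiter. */
static void race_wakeup(void)
{
	pthread_mutex_lock(&race_lock);
	race_state = 1;
	pthread_cond_broadcast(&race_waitq);
	pthread_mutex_unlock(&race_lock);
}

/* Helper thread body: wake the waiter after a short delay. */
static void *delayed_wakeup(void *arg)
{
	struct timespec delay = { 0, 100 * 1000000L };	/* 100 ms */

	(void)arg;
	nanosleep(&delay, NULL);
	race_wakeup();
	return NULL;
}
```

If race_wakeup() runs before race_wait_timeout(50), the waiter returns 0
after roughly 50 ms instead of blocking forever, which is the behaviour
the patch introduces with its 5 * HZ timeout. When delayed_wakeup() runs
in another thread while the waiter is already blocked, the waiter returns
1. The kernel macro reports more detail than this sketch: it returns the
remaining jiffies on success and a negative value if interrupted.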
include/linux/libcfs/libcfs_fail.h | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/include/linux/libcfs/libcfs_fail.h b/include/linux/libcfs/libcfs_fail.h
index 45166c5..731401b 100644
--- a/include/linux/libcfs/libcfs_fail.h
+++ b/include/linux/libcfs/libcfs_fail.h
@@ -213,8 +213,14 @@ static inline void cfs_race_wait(u32 id)
cfs_race_state = 0;
CERROR("cfs_race id %x sleeping\n", id);
- rc = wait_event_interruptible(cfs_race_waitq,
- cfs_race_state != 0);
+ /*
+ * XXX: don't wait forever as there is no guarantee
+ * that this branch is executed first. For testing
+ * purposes this construction works well enough.
+ */
+ rc = wait_event_interruptible_timeout(cfs_race_waitq,
+ cfs_race_state != 0,
+ 5 * HZ);
CERROR("cfs_fail_race id %x awake: rc=%d\n", id, rc);
}
}
--
1.8.3.1