[lustre-devel] [PATCH 17/19] lustre: llite: mend the trunc_sem_up_write()

James Simmons jsimmons at infradead.org
Sun Nov 28 15:27:52 PST 2021

From: Bobi Jam <bobijam at whamcloud.com>

The original lli_trunc_sem replacement change (commit ae9e437745) fixed
a deadlock scenario:

t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
|- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
|- do_map()              |- ll_direct_IO_impl()
|- vvp_io_fault_start    |- ll_get_user_pages()

                                                    |- down_write(
                         |- down_read(mmap_sem)        trunc_sem)
|- down_read(trunc_sem)

t1 waits for the read side of trunc_sem, which is blocked by t3, since
t3 is waiting for the write side while t2 holds the read side; t2 in
turn is waiting for mmap_sem, which has been taken by t1, and a
deadlock ensues.

Commit ae9e437745 changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in the page fault path, so that it ignores
a waiting down_write(trunc_sem) and simply takes the read semaphore if
no writer has taken the semaphore, which breaks the deadlock.
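
The difference between the two read paths can be sketched in userspace
with C11 atomics. This is a simplified model, not the llite code: the
struct layout, the "waiters" counter and the try-style helpers are
assumptions made for illustration, and the real functions block in
wait_var_event() rather than returning false.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; the layout and helper names are hypothetical. */
struct model_trunc_sem {
	atomic_int readers;	/* >0: reader count, 0: unlocked, -1: held for write */
	atomic_int waiters;	/* assumed counter of writers waiting for the lock */
};

/* Increment *v unless it is negative; returns true if incremented. */
static bool model_inc_unless_negative(atomic_int *v)
{
	int old = atomic_load(v);

	while (old >= 0) {
		/* On failure 'old' is refreshed and the sign is re-checked. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Ordinary read attempt: backs off while any writer is waiting. */
static bool model_try_down_read(struct model_trunc_sem *sem)
{
	return atomic_load(&sem->waiters) == 0 &&
	       model_inc_unless_negative(&sem->readers);
}

/* "nowait" read attempt: ignores waiting writers, fails only if the
 * semaphore is currently held for write. */
static bool model_try_down_read_nowait(struct model_trunc_sem *sem)
{
	return model_inc_unless_negative(&sem->readers);
}
```

In the scenario above, t2 holds the read side and t3 is queued as a
writer; under this model the ordinary attempt by t1 would fail, while
the nowait attempt succeeds, so t1 can finish and release mmap_sem.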

But there is a subtlety in using wake_up_var(): wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly tests for waiters on the
queue. If it is called without an explicit smp_mb(), the
waitqueue_active() check can be hoisted before the condition store, so
that the waker observes an empty wait list while the waiter does not
observe the condition, and the waiter is then never woken up.
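
The ordering requirement can be illustrated with a userspace sketch
using C11 atomics. It is only a model under stated assumptions: the
struct and function names are hypothetical, a counter stands in for the
wait queue, and atomic_thread_fence(memory_order_seq_cst) plays the
role of smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a wait variable and its queue. */
struct model_wait_var {
	atomic_int cond;	/* the condition the waiter sleeps on */
	atomic_int nwaiters;	/* non-zero models a non-empty wait queue */
};

/* Waker side: publish the condition, then check for waiters locklessly.
 * Returns true if a waiter was seen and therefore must be woken. */
static bool model_waker_publish(struct model_wait_var *v)
{
	atomic_store_explicit(&v->cond, 1, memory_order_relaxed);
	/* Plays the role of smp_mb(): without it the load below could be
	 * satisfied before the store above becomes visible. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&v->nwaiters, memory_order_relaxed) > 0;
}

/* Waiter side: enqueue first, then re-check the condition.
 * Returns true if the condition is already set, so no sleep is needed. */
static bool model_waiter_prepare(struct model_wait_var *v)
{
	atomic_fetch_add_explicit(&v->nwaiters, 1, memory_order_relaxed);
	/* Plays the role of the barrier implied by prepare_to_wait(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&v->cond, memory_order_relaxed) != 0;
}
```

With both fences in place, at least one side sees the other: either the
waker finds the waiter queued, or the waiter finds the condition set.
Dropping the waker-side fence permits the outcome where both return
false, which corresponds to the lost wakeup described above.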

Fixes: ae9e437745 ("lustre: llite: replace lli_trunc_sem")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14713
Lustre-commit: 39745c8b5493159bb ("LU-14713 llite: mend the trunc_sem_up_write()")
Signed-off-by: Bobi Jam <bobijam at whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43844
Reviewed-by: Andreas Dilger <adilger at whamcloud.com>
Reviewed-by: Neil Brown <neilb at suse.de>
Reviewed-by: Patrick Farrell <pfarrell at whamcloud.com>
Reviewed-by: Oleg Drokin <green at whamcloud.com>
Signed-off-by: James Simmons <jsimmons at infradead.org>
 fs/lustre/llite/llite_internal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index 7768c99..ce7431f 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -365,6 +365,8 @@ static inline void trunc_sem_down_write(struct ll_trunc_sem *sem)
 static inline void trunc_sem_up_write(struct ll_trunc_sem *sem)
 {
 	atomic_set(&sem->ll_trunc_readers, 0);
+	/* match the smp_mb() in wait_var_event()->prepare_to_wait() */
+	smp_mb();
 	wake_up_var(&sem->ll_trunc_readers);
 }
