[lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers

Jan Kara jack at suse.cz
Mon Feb 6 01:24:15 PST 2017


On Fri 03-02-17 15:20:54, Andrew Morton wrote:
> On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack at suse.cz> wrote:
> 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as their return
> > code (GFS2 or Lustre can definitely do this). However, VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writably mapping the page, which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead, which
> > results in bailing out from the fault code; the CPU then retries the
> > access, and we fault again, effectively doing what the handler wanted.
> 
> I'm not getting any sense of the urgency of this fix.  The bug *sounds*
> bad?  Which kernel versions need fixing?

So I did more analysis of GFS2 and Lustre behavior. AFAICS GFS2 returns
EAGAIN only for a truncated page. When we then return with VM_FAULT_RETRY,
do_page_mkwrite() locks the page, sees it is truncated, and bails out
properly, thus silently fixing up the problem. The Lustre bug looks like it
could actually result in some real problems, and it has been there since
the initial commit in which Lustre was added in 3.11 (d7e09d0397e84).
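
The control flow described above can be sketched as a small user-space
model in C. This is an illustration under stated assumptions, not the
kernel code: the MODEL_* flag values, struct model_page and both function
names are hypothetical, page locking is reduced to a boolean, and every
error other than EAGAIN is lumped into one SIGBUS-like code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder flag values for this model only (not the kernel's). */
enum {
	MODEL_FAULT_SIGBUS = 1 << 0,
	MODEL_FAULT_NOPAGE = 1 << 1,
	MODEL_FAULT_LOCKED = 1 << 2,
	MODEL_FAULT_RETRY  = 1 << 3,
};
#define MODEL_FAULT_ERROR MODEL_FAULT_SIGBUS

/* Hypothetical stand-in for a page: lock state plus "still in the file". */
struct model_page {
	bool locked;
	bool has_mapping;	/* false once the page has been truncated */
};

/*
 * Handler side: translate an errno-style result into a fault code.
 * eagain_as_nopage == false models the old behaviour (EAGAIN -> RETRY),
 * true models the behaviour proposed in the patch (EAGAIN -> NOPAGE).
 */
static int model_mkwrite_return(int err, bool eagain_as_nopage)
{
	if (err == 0)
		return MODEL_FAULT_LOCKED;
	if (err == -EAGAIN)
		return eagain_as_nopage ? MODEL_FAULT_NOPAGE : MODEL_FAULT_RETRY;
	return MODEL_FAULT_SIGBUS;	/* simplification: all other errors */
}

/*
 * Caller side: only error and NOPAGE bits cause an early return.
 * A RETRY bit is not checked, so it falls through to the locking path.
 */
static int model_do_page_mkwrite(struct model_page *page, int handler_ret)
{
	if (handler_ret & (MODEL_FAULT_ERROR | MODEL_FAULT_NOPAGE))
		return handler_ret;
	if (!(handler_ret & MODEL_FAULT_LOCKED)) {
		page->locked = true;		/* stands in for locking the page */
		if (!page->has_mapping) {
			page->locked = false;	/* truncated: unlock and bail out */
			return 0;
		}
		handler_ret |= MODEL_FAULT_LOCKED;
	} else {
		assert(page->locked);		/* handler claimed it locked the page */
	}
	return handler_ret;
}
```

In this model, feeding MODEL_FAULT_RETRY for a truncated page returns 0
with the page unlocked (the GFS2 case), while the same code for a page
that is still mapped comes back with the page locked and the LOCKED bit
set, which is the outcome the handler did not ask for. Returning
MODEL_FAULT_NOPAGE instead leaves the page untouched.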

So overall the issue doesn't look too serious currently, but it is
certainly a serious bug waiting to happen.

								Honza

-- 
Jan Kara <jack at suse.com>
SUSE Labs, CR

