[lustre-devel] lustre:pcc: Sanity-pcc 7a test hang(Both on Aarch64 and X86_64) discussion

Fri Mar 11 00:18:43 PST 2022

Hi,

We have recently been working on the bug https://jira.whamcloud.com/browse/LU-14346.
This bug makes an *mmap write* hang forever. It was first seen on
Aarch64, but with a small change
<https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#reproduced-on-x86_64-with-a-small-change>,
we *can easily reproduce it on X86_64* as well. For a more detailed analysis of this
bug, see
<https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md>.

The hang occurs here
<https://github.com/lustre/lustre-release/blob/master/lustre/tests/multiop.c#L725>,
shown below:
==============
    case 'W':
        for (i = 0; i < mmap_len && mmap_ptr; i += 4096)
            mmap_ptr[i] += junk++;
        break;
===============

*Bug Analysis: different behavior when running mmap_ptr[i] += junk++ on
different platforms*
Conceptually, this statement does two things:
1. Read from mmap_ptr[i] first (this triggers the read page fault).
2. Write a value to the same page (this triggers page_mkwrite to make the
page writable).
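
The two steps above can be made explicit in user space. The following is only a minimal sketch, not the multiop.c code: the helper name touch_pages_read_then_write and the file path are made up for illustration. It splits the `+=` into a separate load and store through a volatile pointer, so the first access to each page is a read and the second is a write. Whether the kernel then takes one fault or two per page depends on the platform and compiler, as discussed in the rest of this mail.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of `path` with MAP_SHARED and touch
 * one byte per 4096-byte step, first with a load, then with a store.
 * Returns the number of steps touched, or -1 on error. */
long touch_pages_read_then_write(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    volatile char *ptr = map;
    char junk = 0;
    long touched = 0;

    for (size_t i = 0; i < len; i += 4096) {
        char old = ptr[i];              /* step 1: load from the page */
        ptr[i] = (char)(old + junk++);  /* step 2: store to the same page */
        touched++;
    }

    munmap(map, len);
    return touched;
}
```

Calling touch_pages_read_then_write("/tmp/some_file", 8192) on an ordinary local filesystem touches two steps; on a PCC-attached Lustre file the same access pattern would go through ll_fault and ll_page_mkwrite.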

However, the two platforms execute it quite differently.
On Aarch64, do_page_fault runs without FAULT_FLAG_WRITE set, so
handle_pte_fault calls do_read_fault:

   - do_read_fault
      - __do_fault -> call ll_fault, get a page from pcc_fault
      - finish_fault (map the returned page to page tables)
      - unlock_page
      - vmf->flags is VM_FAULT_LOCKED
   - the write then calls do_wp_page --> do_page_mkwrite --> ll_page_mkwrite

On X86_64 the mechanism is different: do_page_fault runs with
*FAULT_FLAG_WRITE set*, so handle_pte_fault calls *do_shared_fault*.

   - do_shared_fault
      -  __do_fault -> call ll_fault, get a page from pcc_fault
      - do_page_mkwrite-> call ll_page_mkwrite
      - finish_fault(map the returned page to page tables)
      - fault_dirty_shared_page
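
The difference between the two call chains comes down to which branch the fault dispatcher takes. Below is a toy user-space model of that decision for a file-backed mapping. It is only a sketch: the struct, enum, and function names are invented here, and the real handle_pte_fault handles many more cases (anonymous memory, swap, NUMA hints and so on).

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel paths named in this mail. */
enum toy_path {
    TOY_DO_READ_FAULT,
    TOY_DO_COW_FAULT,
    TOY_DO_SHARED_FAULT,
    TOY_DO_WP_PAGE,
    TOY_NO_FAULT_WORK
};

/* Hypothetical, simplified view of the state the dispatcher looks at. */
struct toy_fault {
    bool pte_present;   /* a PTE already maps the page */
    bool pte_writable;  /* that PTE allows writes */
    bool write_access;  /* models FAULT_FLAG_WRITE */
    bool vma_shared;    /* models VM_SHARED on the mapping */
};

enum toy_path toy_handle_pte_fault(const struct toy_fault *f)
{
    if (!f->pte_present) {
        /* No PTE yet: the ->fault() handler has to supply a page. */
        if (!f->write_access)
            return TOY_DO_READ_FAULT;
        return f->vma_shared ? TOY_DO_SHARED_FAULT : TOY_DO_COW_FAULT;
    }
    /* PTE exists but is read-only and we want to write. */
    if (f->write_access && !f->pte_writable)
        return TOY_DO_WP_PAGE;
    return TOY_NO_FAULT_WORK;
}
```

In this model the Aarch64 case is two calls (a read with no PTE, then a write with a read-only PTE), while the X86_64 case is a single write with no PTE.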

*Bug Analysis: why it hangs forever*
See also
https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#kernel-do_page_fault-process-analysis
for more details.

do_page_mkwrite ---> ll_page_mkwrite:
    Inject the failure 0x1412 OBD_FAIL_LLITE_PCC_DETACH_MKWRITE.
    Return with VM_FAULT_RETRY | VM_FAULT_NOPAGE.
    On retry, because the PTE is not NULL and vmf->flags has FAULT_FLAG_WRITE,
    the kernel calls do_wp_page again.
So the next attempt enters do_page_mkwrite again, and the process hangs forever.
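
The loop described above can be mimicked with a small user-space model. This is only an illustration under the assumption stated in this mail, namely that every call into page_mkwrite keeps returning the retry flags while the PTE stays present and read-only. The TOY_* flag values and all names are invented and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Stand-in flag bits; the values are NOT the kernel's. */
#define TOY_FAULT_RETRY  0x1u
#define TOY_FAULT_NOPAGE 0x2u

struct toy_state {
    bool pte_present;
    bool pte_writable;
    bool mkwrite_keeps_retrying; /* assumption: models the injected failure */
    int  mkwrite_calls;
};

static unsigned toy_page_mkwrite(struct toy_state *s)
{
    s->mkwrite_calls++;
    if (s->mkwrite_keeps_retrying)
        return TOY_FAULT_RETRY | TOY_FAULT_NOPAGE;
    s->pte_writable = true;
    return 0;
}

/* Returns the attempt on which the write fault completed, or -1 if it was
 * still being retried after max_attempts (the "hang"). */
int toy_retry_write(struct toy_state *s, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        unsigned ret = 0;

        /* PTE present but read-only: the do_wp_page -> do_page_mkwrite path. */
        if (s->pte_present && !s->pte_writable)
            ret = toy_page_mkwrite(s);

        if (!(ret & TOY_FAULT_RETRY))
            return attempt;
        /* Nothing changed the PTE, so the next attempt takes the same path. */
    }
    return -1;
}
```

Because nothing in the loop changes pte_present, each retry lands in toy_page_mkwrite again, which is the shape of the hang described above.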

*Seeking a good solution*
As the analysis above shows, *we want the kernel to retry the whole mmap
write (->fault() and ->page_mkwrite()).*
In handle_pte_fault, if there is no page or the page is not mapped (no PTE
found), then
 __do_page_fault will go through the fault handling (->fault()) again.

The easy fix here would be to *remove the page and its page table entry when
we do the fail injection in pcc_page_mkwrite.* But I have not found a good way
to do this, so I am listing the information here and asking the community for help.
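
To make the intent of this idea concrete, here is a toy user-space model, not kernel code. It assumes (this is the hypothesis of this mail, not a verified fix) that once the PTE is gone, the retry goes through ->fault() again and picks up a page on which page_mkwrite can succeed. All names are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical model state: is a PTE installed, and does it still point at
 * the page that page_mkwrite refuses to accept? */
struct toy_mapping {
    bool pte_present;
    bool page_stale;
};

/* Returns the attempt on which the write fault succeeded, or -1 if it was
 * still retrying after max_attempts. */
int toy_attempts_until_mapped(bool drop_pte_on_failure, int max_attempts)
{
    struct toy_mapping m = { .pte_present = true, .page_stale = true };

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (!m.pte_present) {
            /* No PTE: models the ->fault() path installing a fresh page. */
            m.page_stale = false;
            m.pte_present = true;
        }

        /* Models ->page_mkwrite(): succeeds only on a non-stale page. */
        if (!m.page_stale)
            return attempt;

        /* Failure path: optionally drop the PTE before asking for a retry. */
        if (drop_pte_on_failure)
            m.pte_present = false;
    }
    return -1;
}
```

In this model toy_attempts_until_mapped(false, 100) never succeeds, while toy_attempts_until_mapped(true, 100) succeeds on the second attempt. How to actually drop the page and PTE safely from inside pcc_page_mkwrite is exactly the open question.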

One fix I tried:
adding a call to *generic_error_remove_page*, but the mapped page still
cannot be unmapped successfully. The error log is here
<https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#solution>.

I'm a newbie to Lustre and not very familiar with the memory
management code, so any advice on fixing this bug would be appreciated.
Thanks in advance.

*Best Regards*

*Kevin Zhao*

Tech Lead, LDCG Cloud Infrastructure

Linaro Vertical Technologies

IRC(freenode): kevinz

Slack(kubernetes.slack.com): kevinz

kevin.zhao at linaro.org | Mobile/Direct/Wechat:  +86 18818270915