[lustre-devel] lustre:pcc: Sanity-pcc 7a test hang(Both on Aarch64 and X86_64) discussion

Sun Mar 13 18:41:37 PDT 2022

Hi Andreas,

Great! Thanks for the info. I will take a look at that patch.

On Sat, 12 Mar 2022 at 08:02, Andreas Dilger <adilger at whamcloud.com> wrote:

> Kevin,
> Qian has a patch https://review.whamcloud.com/40092 "LU-14003
> <https://jira.whamcloud.com/browse/LU-14003> pcc: rework PCC mmap
> implementation" that is changing the PCC MMAP code significantly, but is
> waiting for the 2.16.0 feature landing window to open.  It needs to be
> refreshed, but it would be helpful if you could take a look through that
> patch to see if it would resolve the issue you are seeing.
>
> On Mar 11, 2022, at 01:18, Kevin Zhao via lustre-devel <
> lustre-devel at lists.lustre.org> wrote:
>
> Hi,
>
> We have recently been working on the bug
> https://jira.whamcloud.com/browse/LU-14346. This bug makes the *mmap
> write* hang forever. It was first seen on Aarch64, but with a small
> change
> <https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#reproduced-on-x86_64-with-a-small-change>
> we *can easily reproduce it on X86_64* as well. A more detailed analysis
> of the bug is available here
> <https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md>
> .
>
> The hang occurs here
> <https://github.com/lustre/lustre-release/blob/master/lustre/tests/multiop.c#L725>,
> in the following code:
> ==============
>     case 'W':
>         for (i = 0; i < mmap_len && mmap_ptr; i += 4096)
>             mmap_ptr[i] += junk++;
>         break;
> ==============
>
> *Bug Analysis: mmap_ptr[i] += junk++ behaves differently on different
> platforms.*
> Conceptually, this statement is executed in two steps:
> 1. Read from mmap_ptr[i] first (this triggers the read page fault).
> 2. Write a value to the same page (this calls page_mkwrite to make the
> page writable).
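
For anyone who wants to play with this access pattern outside of multiop, here is a minimal user-space sketch. It is not the multiop code: mmap_rmw_demo() and touch_pages() are hypothetical helpers, and the scratch file lives in /tmp on a local filesystem instead of a PCC-attached Lustre file, so it will not hang. Whether the first fault is a read or a write depends on how the compiler and CPU carry out the read-modify-write, which this sketch does not control.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Same access pattern as the 'W' case above: one read-modify-write per
 * step. Returns the number of locations touched.
 */
static size_t touch_pages(volatile unsigned char *ptr, size_t len, size_t step)
{
    unsigned char junk = 1;
    size_t i, touched = 0;

    assert(step > 0);   /* a zero step would never terminate */
    for (i = 0; i < len && ptr; i += step) {
        ptr[i] += junk++;   /* read the byte, then store it back */
        touched++;
    }
    return touched;
}

/*
 * Hypothetical helper: map a zero-filled scratch file MAP_SHARED, touch
 * every page, then read the file back through the fd.
 * Returns the number of pages holding the expected byte, or -1 on error.
 */
static long mmap_rmw_demo(size_t npages)
{
    char path[] = "/tmp/mmap_rmw_XXXXXX";
    long pagesz = sysconf(_SC_PAGESIZE);
    unsigned char *map, byte;
    size_t len, i;
    long ok = 0;
    int fd;

    /* junk is one byte wide, so keep the expected values below 256 */
    if (pagesz <= 0 || npages == 0 || npages > 255)
        return -1;
    len = npages * (size_t)pagesz;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file goes away once fd is closed */

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    touch_pages(map, len, (size_t)pagesz);
    if (msync(map, len, MS_SYNC) != 0)
        ok = -1;

    /* page i started as 0 and had (i + 1) added to its first byte */
    for (i = 0; ok >= 0 && i < npages; i++) {
        if (pread(fd, &byte, 1, (off_t)(i * (size_t)pagesz)) == 1 &&
            byte == (unsigned char)(i + 1))
            ok++;
    }
    munmap(map, len);
    close(fd);
    return ok;
}
```

Calling mmap_rmw_demo(4) should return 4 on a healthy local filesystem.
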
>
> In practice, the fault path differs between platforms.
> On the aarch64 platform, do_page_fault is entered without
> FAULT_FLAG_WRITE set, so handle_pte_fault calls do_read_fault:
>
>    - do_read_fault
>       - __do_fault -> calls ll_fault, gets a page from pcc_fault
>       - finish_fault (maps the returned page into the page tables)
>       - unlock_page
>       - vmf->flags is VM_FAULT_LOCKED
>    - the following write fault calls do_wp_page -> do_page_mkwrite ->
>      ll_page_mkwrite
>
> On the X86_64 platform the path is different: do_page_fault is entered
> with *FAULT_FLAG_WRITE set*, so handle_pte_fault calls
> *do_shared_fault*.
>
>    - do_shared_fault
>       - __do_fault -> calls ll_fault, gets a page from pcc_fault
>       - do_page_mkwrite -> calls ll_page_mkwrite
>       - finish_fault (maps the returned page into the page tables)
>       - fault_dirty_shared_page
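
The two paths above can be summarized with a small user-space model of the dispatch decision. This is a simplified sketch for a file-backed mapping, not kernel code: pick_handler(), the enum and the boolean parameters are names I made up to stand in for the PTE state, FAULT_FLAG_WRITE and a MAP_SHARED vma.

```c
#include <assert.h>
#include <stdbool.h>

/* Handlers named after the kernel functions they stand in for. */
enum fault_handler {
    H_DO_READ_FAULT,
    H_DO_COW_FAULT,
    H_DO_SHARED_FAULT,
    H_DO_WP_PAGE,
    H_NONE,     /* nothing left to do in this simplified model */
};

/*
 * Simplified model of the decision for a file-backed mapping.
 *   pte_present:  a page is already mapped at the faulting address
 *   pte_writable: that mapping already allows writes
 *   write:        models FAULT_FLAG_WRITE
 *   shared:       models a MAP_SHARED vma
 */
static enum fault_handler pick_handler(bool pte_present, bool pte_writable,
                                       bool write, bool shared)
{
    if (!pte_present) {
        if (!write)
            return H_DO_READ_FAULT;
        return shared ? H_DO_SHARED_FAULT : H_DO_COW_FAULT;
    }
    if (write && !pte_writable)
        return H_DO_WP_PAGE;
    return H_NONE;
}

/* The two sequences described above, expressed with the model. */
static void model_self_check(void)
{
    /* aarch64: read fault first, then a write fault on the mapped page */
    assert(pick_handler(false, false, false, true) == H_DO_READ_FAULT);
    assert(pick_handler(true, false, true, true) == H_DO_WP_PAGE);

    /* X86_64: a single write fault on an unmapped shared page */
    assert(pick_handler(false, false, true, true) == H_DO_SHARED_FAULT);
}
```

The point of the model is that do_wp_page is only reached when a PTE is already present, which is the state the aarch64 read fault leaves behind.
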
>
> *Bug Analysis: why it hangs forever*
> See also
> https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#kernel-do_page_fault-process-analysis
> for more details.
>
> do_page_mkwrite -> ll_page_mkwrite:
>     The test injects the failure 0x1412 OBD_FAIL_LLITE_PCC_DETACH_MKWRITE.
>     It returns VM_FAULT_RETRY | VM_FAULT_NOPAGE.
>     The fault is retried. Because the PTE is not NULL and vmf->flags has
> FAULT_FLAG_WRITE set, do_wp_page is called again.
> So on every retry we enter do_page_mkwrite again, and the write hangs
> forever.
>
> *Seeking a good solution*
> As the analysis above shows, *we want the kernel to retry the whole mmap
> write (->fault() and ->page_mkwrite()).*
> In handle_pte_fault, if there is no page or the page is not mapped (no
> PTE found), then
>  __do_page_fault will run the fault handling again.
>
> The easy fix is to *remove the page and its page table entry when the
> failure is injected in pcc_page_mkwrite.* However, I have not found a
> good way to do this, so I am listing the details here and asking the
> community for help.
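
To make the intent concrete, here is a toy user-space model of the retry loop and of the proposed fix. It is only a sketch under my own assumptions: struct toy_fault and the toy_* functions are invented for illustration, the model assumes that page_mkwrite keeps asking for a retry as long as the page mapped before the injected detach stays in the page table, and it says nothing about how to actually drop the PTE in the kernel, which is the open question.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_DONE  0
#define TOY_RETRY 1     /* stands in for VM_FAULT_RETRY | VM_FAULT_NOPAGE */

struct toy_fault {
    bool pte_mapped;    /* PTE left behind by the earlier read fault */
    bool page_stale;    /* mapped page predates the injected detach */
    bool unmap_on_fail; /* proposed fix: drop the PTE before retrying */
    int fault_calls;    /* how often the ->fault() path ran */
    int mkwrite_calls;  /* how often the ->page_mkwrite() path ran */
};

/* Models page_mkwrite: a stale page always asks for a retry. */
static int toy_page_mkwrite(struct toy_fault *f)
{
    f->mkwrite_calls++;
    if (f->page_stale) {
        if (f->unmap_on_fail)
            f->pte_mapped = false;
        return TOY_RETRY;
    }
    return TOY_DONE;
}

/* Returns true if the write completes within max_retries retries. */
static bool toy_mmap_write(struct toy_fault *f, int max_retries)
{
    int attempt;

    for (attempt = 0; attempt <= max_retries; attempt++) {
        if (!f->pte_mapped) {
            /* no PTE: the full path runs and installs a fresh page */
            f->fault_calls++;
            f->pte_mapped = true;
            f->page_stale = false;
        }
        /* PTE present and write requested: mkwrite is called */
        if (toy_page_mkwrite(f) == TOY_DONE)
            return true;
    }
    return false;   /* the real kernel has no cap and would keep going */
}

static void toy_self_check(void)
{
    struct toy_fault hang = { .pte_mapped = true, .page_stale = true };
    struct toy_fault fixed = { .pte_mapped = true, .page_stale = true,
                               .unmap_on_fail = true };

    assert(!toy_mmap_write(&hang, 100));    /* never leaves mkwrite */
    assert(hang.fault_calls == 0);
    assert(toy_mmap_write(&fixed, 100));    /* one retry is enough */
    assert(fixed.fault_calls == 1 && fixed.mkwrite_calls == 2);
}
```

In the model, the only difference between hanging and finishing is whether the stale PTE is gone before the retry.
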
>
> One fix I tried:
> calling *generic_error_remove_page*, but the mapped page still cannot be
> unmapped successfully. The error log is here
> <https://github.com/kevinzs2048/devbox/blob/master/notes/lustre/pcc_sanity-pcc-7a-analysis.md#solution>
> .
>
> I am new to Lustre and not very familiar with the memory management
> code, so any advice on fixing this bug would be appreciated. Thanks in
> advance.
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
>

-- 
*Best Regards*

*Kevin Zhao*

Tech Lead, LDCG Cloud Infrastructure

Linaro Vertical Technologies

IRC(freenode): kevinz

Slack(kubernetes.slack.com): kevinz

kevin.zhao at linaro.org | Mobile/Direct/Wechat:  +86 18818270915