[lustre-discuss] [EXTERNAL] Accessing files with bad PFL causing MDS kernel panics

Mohr, Rick mohrrf at ornl.gov
Tue Oct 25 14:50:35 PDT 2022


Nate,

For the example layout you attached, it looks like the file does not have any data in the components with the messed-up extent_end value. Have you tried using "lfs setstripe --component-del" to delete just those components, and then checking whether you can access the data?
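
A rough sketch of how the affected components could be picked out of "lfs getstripe -v" output and turned into component-del commands. The field names (lcme_id, lcme_extent.e_start, lcme_extent.e_end) are what lfs getstripe -v normally prints, but the sample text below is hypothetical and is not the attached layout. The helper names are made up for illustration. Nothing here runs lfs; it only builds the argument lists so they can be reviewed first.

```python
import re
from typing import List

# Assumption: the overflowed value is 16 EiB - 2 GiB, as reported in this thread.
BAD_EXTENT = (1 << 64) - (1 << 31)

def bad_component_ids(getstripe_text: str) -> List[int]:
    """Return lcme_id values whose extent start or end equals BAD_EXTENT."""
    bad: List[int] = []
    current = None
    for line in getstripe_text.splitlines():
        m = re.match(r"\s*lcme_id:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        # "EOF" end values do not match \d+ and are skipped on purpose.
        m = re.match(r"\s*lcme_extent\.e_(?:start|end):\s*(\d+)", line)
        if m and current is not None and int(m.group(1)) == BAD_EXTENT:
            if current not in bad:
                bad.append(current)
    return bad

def component_del_commands(path: str, ids: List[int]) -> List[List[str]]:
    """Build (but do not run) one 'lfs setstripe --component-del' per ID.

    IDs are emitted in reverse order of appearance, on the assumption that
    components have to be deleted starting from the last one.
    """
    return [["lfs", "setstripe", "--component-del", "-I", str(i), path]
            for i in reversed(ids)]

# Hypothetical sample, trimmed to the fields the parser looks at.
SAMPLE = """\
    lcme_id:             1
    lcme_extent.e_start: 0
    lcme_extent.e_end:   1073741824
    lcme_id:             2
    lcme_extent.e_start: 1073741824
    lcme_extent.e_end:   18446744071562067968
    lcme_id:             3
    lcme_extent.e_start: 18446744071562067968
    lcme_extent.e_end:   EOF
"""

if __name__ == "__main__":
    for cmd in component_del_commands("badfile", bad_component_ids(SAMPLE)):
        print(" ".join(cmd))
```

I would only try the resulting commands on a copy or a single sacrificial file first, since touching these files is what triggers the LBUG.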

--Rick


On 10/25/22, 4:43 PM, "lustre-discuss on behalf of Nathan Crawford" <lustre-discuss-bounces at lists.lustre.org on behalf of nrcrawfo at uci.edu> wrote:

    Hi All,
      I'm looking for possible work-arounds to recover data from some mis-migrated files (as seen in LU-16152). Basically, there's a bug in "lfs setstripe --yaml" where extent start/end values of 2 GiB or more in the YAML file overflow to 16 EiB - 2 GiB.

      Using lfs_migrate, I re-striped many files in directories with a default striping pattern containing these values. I'm pretty sure that the data exists (I was trying to purge an older OST, and disk usage on the other OSTs increased as the purged OST decreased), and an lfsck procedure happily returns after a day or so. Unfortunately, attempts to access or re-migrate the files trigger a kernel panic on the MDS with:

    LustreError: 12576:0:(osd_io.c:311:kmem_to_page()) ASSERTION( !((unsigned long)addr & ~(~(((1UL) << 12)-1))) ) failed:
    LustreError: 12576:0:(osd_io.c:311:kmem_to_page()) LBUG

    Kernel panic - not syncing: LBUG


      The servers are Lustre 2.12.8 on OpenZFS 0.8.5 on CentOS 7.9. The output from "lfs getstripe -v badfile" is attached.

      I can use lfs find to search for files with these bad extent endpoint values, then move them to a quarantine area on the same FS. This will (hopefully) allow the rest of the system to stay up, but I still need to recover the data.

    Thanks!
    Nate

    -- 
    Dr. Nathan Crawford              nathan.crawford at uci.edu
    Director of Scientific Computing
    School of Physical Sciences
    164 Rowland Hall                 Office: 152 Rowland Hall
    University of California, Irvine  Phone: 949-824-1380
    Irvine, CA 92697-2025, USA
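
For reference, the "16 EiB - 2 GiB" figure in the quoted message can be reproduced with a little arithmetic. The sketch below assumes the value passes through a signed 32-bit integer and is then widened to an unsigned 64-bit one. That mechanism is a guess that happens to match the reported number for an extent of exactly 2 GiB; it is not taken from the lfs source. The function name is made up for illustration.

```python
GIB = 1 << 30
EIB = 1 << 60

def sign_extend_32_to_u64(value: int) -> int:
    """Truncate to 32 bits, read as signed int32, then view as unsigned 64-bit."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:      # sign bit set, so the int32 reading is negative
        low -= 1 << 32
    return low & 0xFFFFFFFFFFFFFFFF

if __name__ == "__main__":
    bad = sign_extend_32_to_u64(2 * GIB)
    print(bad)                          # 18446744071562067968
    print(bad == 16 * EIB - 2 * GIB)    # True
    print(sign_extend_32_to_u64(GIB))   # 1073741824, below 2 GiB is unchanged
```

Under this assumption, 18446744071562067968 is the number to look for in the lcme_extent fields of "lfs getstripe -v" output.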


