[lustre-discuss] Stray files after failed lfs_migrate

Spitz, Cory James cory.spitz at hpe.com
Fri Mar 5 07:18:01 PST 2021


> lfsck needs to be done with the whole volume offline?
No, in Lustre 2.x lfsck is an online tool.

Per https://doc.lustre.org/lustre_manual.xhtml#idm139675950896912:
Disaster recovery tool: The Lustre file system provides an online distributed file system check (LFSCK) that can restore consistency between storage components in case of a major file system error. A Lustre file system can operate even in the presence of file system inconsistencies, and LFSCK can run while the filesystem is in use, so LFSCK is not required to complete before returning the file system to production.
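
For illustration, here is a minimal sketch of kicking off an online layout check from the MDS while clients stay mounted. The file system name "testfs" and the target index MDT0000 are placeholders, not taken from this thread, and the wrapper only prints the commands unless RUN=1 is set, so nothing is changed by default. The --dryrun option asks LFSCK to report inconsistencies without repairing them; check the lctl man page for your Lustre release before relying on any of these options.

```shell
#!/bin/sh
# Sketch only: FSNAME and the MDT index are assumed placeholders.
FSNAME="${FSNAME:-testfs}"
MDT="${FSNAME}-MDT0000"

# Print the command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

lfsck_layout_check() {
    # Start a layout (MDT-OST consistency) scan on all targets, report-only.
    run lctl lfsck_start -M "$MDT" -A -t layout --dryrun on
    # Show the status and statistics of the layout scan on this MDT.
    run lctl get_param -n "mdd.${MDT}.lfsck_layout"
}

lfsck_layout_check
```

Run it once as-is to review the commands, then again on the MDS with RUN=1 to actually start the scan.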

-Cory

On 3/4/21, 7:56 PM, "lustre-discuss on behalf of Angelos Ching via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss at lists.lustre.org> wrote:

    Thanks Rick,

    I had always assumed that data resided on the MDT, but your explanation
    makes sense, since the files are temporary files used by mysqld that
    might have been deleted while they were being migrated. Since they
    are not needed anyway, I just unlink-ed them (rm stats the file
    before removal, and that stat fails outright).
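
A rough sketch of the workaround described above, for anyone hitting the same symptom: unlink(1) calls unlink(2) on the name directly, so it can remove a directory entry that rm refuses to touch. The helper name remove_stray is made up for this example, and the file names in the usage comment are the ones from the original report. Only do this for files you are sure are not needed.

```shell
#!/bin/sh
# remove_stray NAME...: hypothetical helper that removes each directory
# entry with unlink(1) instead of rm.
remove_stray() {
    for name in "$@"; do
        if unlink "$name"; then
            echo "unlinked: $name"
        else
            echo "failed to unlink: $name" >&2
        fi
    done
}

# Usage, from inside the affected directory:
#   remove_stray ib30N1p9 ib3AQ9vK ib7rP0qy ibleTHWm
```

Any OST objects left behind by the failed migration are a separate matter; an LFSCK layout scan is the tool for finding those.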

    IIRC lfsck needs to be done with the whole volume offline?

    Best regards,
    Angelos

    On 04/03/2021 06:10, Mohr, Rick via lustre-discuss wrote:
    > Angelos,
    >
    > If a file still existed on the MDS but its data on the OST had somehow been removed, then you might see symptoms like those you described (stat fails because the object's attributes can't be retrieved from the OST, but lfs getstripe can still query the layout info from the MDS).  If that is the case, though, I can't really say how it might have happened in the first place.
    >
    > Have you tried running lfsck to look for consistency problems?
    >
    > --Rick
    >
    >
    > On 3/2/21, 5:24 AM, "lustre-discuss on behalf of Angelos Ching via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss at lists.lustre.org> wrote:
    >
    >      Dear all,
    >
    >      I was migrating some OSTs using lfs_migrate and things went
    >      mostly fine, except for a few files that might have been in use
    >      during the migration:
    >
    >      > # ls
    >      > ls: cannot access ibleTHWm: No such file or directory
    >      > ls: cannot access ib7rP0qy: No such file or directory
    >      > ls: cannot access ib3AQ9vK: No such file or directory
    >      > ls: cannot access ib30N1p9: No such file or directory
    >      > ib30N1p9  ib3AQ9vK  ib7rP0qy  ibleTHWm
    >      > # stat ib30N1p9
    >      > stat: cannot stat ‘ib30N1p9’: No such file or directory
    >      > # lfs getstripe ib30N1p9
    >      > ib30N1p9
    >      > lmm_stripe_count:  1
    >      > lmm_stripe_size:   1048576
    >      > lmm_pattern:       raid0
    >      > lmm_layout_gen:    0
    >      > lmm_stripe_offset: 1
    >      >     obdidx         objid         objid         group
    >      >          1          71909438        0x449403e 0
    >      The files can't be stat'ed, but lfs getstripe still returns their layout.
    >
    >      The same error appears on all clients and I've tried unmounting and
    >      remounting the MDT on the server side already.
    >
    >      Any idea what might have been corrupted and what could be the fix?
    >
    >      Cheers,
    >
    >      --
    >      Angelos Ching
    >      ClusterTech Limited
    >
    >      Tel     : +852-2655-6138
    >      Fax     : +852-2994-2101
    >      Address	: Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, Hong Kong
    >
    >      _______________________________________________
    >      lustre-discuss mailing list
    >      lustre-discuss at lists.lustre.org
    >      http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org 
    >


