[lustre-discuss] Stray files after failed lfs_migrate

Angelos Ching angelosching at clustertech.com
Thu Mar 4 17:55:42 PST 2021


Thanks Rick,

I had always assumed that data resided on the MDT, but your explanation
makes sense: the files are temporary files used by mysqld, which might
have deleted them while they were being migrated. Since they are not
needed anyway, I just unlink-ed them (rm stats the file before removal,
so it fails outright).
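
The manual cleanup above could be scripted roughly as follows. This is
only a sketch, not something tested on a Lustre filesystem: the helper
names find_unstatable and unlink_names are hypothetical, and it assumes
the stray entries still show up in a directory listing but fail lstat()
with "No such file or directory", as in the ls output quoted below.
os.unlink() calls unlink(2) directly without stat-ing the path first,
which is the same reason unlink worked where rm did not.

```python
import os


def find_unstatable(directory, lstat=os.lstat):
    """Return names that the directory lists but lstat() reports as missing.

    The lstat parameter is injectable (hypothetical design choice) so the
    function can be exercised on a healthy filesystem.
    """
    stray = []
    for name in sorted(os.listdir(directory)):
        try:
            lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # The entry is listed, but its attributes cannot be retrieved.
            stray.append(name)
    return stray


def unlink_names(directory, names, dry_run=True):
    """Call unlink(2) on each name without stat-ing it first.

    With dry_run=True nothing is removed; the paths are only returned.
    """
    paths = [os.path.join(directory, name) for name in names]
    if not dry_run:
        for path in paths:
            os.unlink(path)
    return paths
```

Running unlink_names(d, find_unstatable(d)) with the default
dry_run=True only reports what would be removed, which is worth
checking before passing dry_run=False.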

IIRC, lfsck needs to be run with the whole volume offline?

Best regards,
Angelos

On 04/03/2021 06:10, Mohr, Rick via lustre-discuss wrote:
> Angelos,
>
> If a file still existed on the MDS but its data on the OST had somehow been removed, then you might see symptoms like those you described.  (stat fails because info can't be retrieved from the OST, but lfs getstripe can still query layout info from the MDS.)  But if that is the case, I can't really say how it might have happened in the first place.
>
> Have you tried running lfsck to look for consistency problems?
>
> --Rick
>
>
> On 3/2/21, 5:24 AM, "lustre-discuss on behalf of Angelos Ching via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss at lists.lustre.org> wrote:
>
>      Dear all,
>
>      I was migrating some OSTs using lfs_migrate and things went
>      mostly fine, except for a few files that might have been in use
>      during the migration:
>
>      > # ls
>      > ls: cannot access ibleTHWm: No such file or directory
>      > ls: cannot access ib7rP0qy: No such file or directory
>      > ls: cannot access ib3AQ9vK: No such file or directory
>      > ls: cannot access ib30N1p9: No such file or directory
>      > ib30N1p9  ib3AQ9vK  ib7rP0qy  ibleTHWm
>      > # stat ib30N1p9
>      > stat: cannot stat ‘ib30N1p9’: No such file or directory
>      > # lfs getstripe ib30N1p9
>      > ib30N1p9
>      > lmm_stripe_count:  1
>      > lmm_stripe_size:   1048576
>      > lmm_pattern:       raid0
>      > lmm_layout_gen:    0
>      > lmm_stripe_offset: 1
>      >     obdidx         objid         objid         group
>      >          1          71909438        0x449403e 0
>      The files can't be stat'ed, but lfs getstripe still returns layout info for them.
>
>      The same error appears on all clients and I've tried unmounting and
>      remounting the MDT on the server side already.
>
>      Any idea what might have been corrupted and what could be the fix?
>
>      Cheers,
>
>      --
>      Angelos Ching
>      ClusterTech Limited
>
>      Tel     : +852-2655-6138
>      Fax     : +852-2994-2101
>      Address	: Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, Hong Kong
>
>      Got praises or room for improvements? http://bit.ly/TellAngelos
>
>
>      _______________________________________________
>      lustre-discuss mailing list
>      lustre-discuss at lists.lustre.org
>      http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

