[lustre-discuss] Wrong --index set for OST

Dilger, Andreas andreas.dilger at intel.com
Tue Sep 26 10:38:45 PDT 2017


On Sep 26, 2017, at 07:35, Ben Evans <bevans at cray.com> wrote:
> 
> I'm guessing on the OSTs, but what you'd want to do is to find files that
> are striped to a single OST using "lfs getstripe".  You'll need one file
> per OST.
> 
> After that, you'll have to do something like iterate through the OSTs to
> find the right combo where an ls -l works for that file.  Keep track of
> what OST indexes map to what devices, because you'll be destroying them
> pretty constantly until you resolve all of them.

I don't think you need to iterate through the configuration each time,
which would take ages to do.  Rather, just do the "lfs getstripe" on a
few files, and then find which OSTs have object IDs (under the O/0/d*
directories) that match the required index.
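
For illustration, that check for one file might look like this on an
ldiskfs-backed OST (paths, device names, and object IDs below are made up):

  # On a client: which index/object does the MDT think the file uses?
  lfs getstripe /mnt/terra/some/file
  #     obdidx    objid    objid    group
  #          3   1234567  0x12d687      0

  # On each candidate OST device, check whether that object exists.
  # ldiskfs OSTs keep objects under O/<seq>/d<objid % 32>/<objid>:
  debugfs -c -R "stat O/0/d$((1234567 % 32))/1234567" /dev/sdX

If the object turns up on a device currently labelled with a different
index, you've found which physical OST really is index 3.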

Essentially, just make an NxN grid of "current index" vs "actual index"
and then start crossing out boxes when the "lfs getstripe" returns an
OST object that doesn't actually exist on the OST (assuming the LFSCK
run didn't mess that up too badly).
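
As a rough sketch of that bookkeeping (device names and object IDs are
placeholders), the grid can be built with a small shell loop:

  # objids[i] = an object ID taken from "lfs getstripe" of a file whose
  # stripe claims to live on OST index i.
  declare -A objids=( [0]=1234567 [3]=2345678 [5]=3456789 )

  for dev in /dev/sdb /dev/sdc /dev/sdd; do      # OST devices to identify
    for idx in "${!objids[@]}"; do
      obj=${objids[$idx]}
      # debugfs -c opens the device read-only, so it is safe to probe with
      if debugfs -c -R "stat O/0/d$((obj % 32))/$obj" "$dev" 2>&1 |
           grep -q "Inode:"; then
        echo "$dev holds the object for claimed index $idx"
      fi
    done
  done

Devices that match none of the sampled objects can be crossed off for those
indexes; a device that matches consistently tells you its true index.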

> Each time you change an OST index, you'll need to do tunefs.lustre
> --writeconf on *all* devices to make them register with the MGS again.
> 
> -Ben Evans
> 
> On 9/26/17, 1:08 AM, "lustre-discuss on behalf of rodger"
> <lustre-discuss-bounces at lists.lustre.org on behalf of
> rodger at csag.uct.ac.za> wrote:
> 
>> Dear All,
>> 
>> Apologies for nagging on this!
>> 
>> Does anyone have any insight on assessing progress of the lfsck?
>> 
>> Does anyone have experience of fixing incorrect index values on an OST?
>> 
>> Regards,
>> Rodger
>> 
>> On 25/09/2017 11:21, rodger wrote:
>>> Dear All,
>>> 
>>> I'm still struggling with this. I am running an lfsck -A at present.
>>> The status update is reporting:
>>> 
>>> layout_mdts_init: 0
>>> layout_mdts_scanning-phase1: 1
>>> layout_mdts_scanning-phase2: 0
>>> layout_mdts_completed: 0
>>> layout_mdts_failed: 0
>>> layout_mdts_stopped: 0
>>> layout_mdts_paused: 0
>>> layout_mdts_crashed: 0
>>> layout_mdts_partial: 0
>>> layout_mdts_co-failed: 0
>>> layout_mdts_co-stopped: 0
>>> layout_mdts_co-paused: 0
>>> layout_mdts_unknown: 0
>>> layout_osts_init: 0
>>> layout_osts_scanning-phase1: 0
>>> layout_osts_scanning-phase2: 12
>>> layout_osts_completed: 0
>>> layout_osts_failed: 30
>>> layout_osts_stopped: 0
>>> layout_osts_paused: 0
>>> layout_osts_crashed: 0
>>> layout_osts_partial: 0
>>> layout_osts_co-failed: 0
>>> layout_osts_co-stopped: 0
>>> layout_osts_co-paused: 0
>>> layout_osts_unknown: 0
>>> layout_repaired: 82358851
>>> namespace_mdts_init: 0
>>> namespace_mdts_scanning-phase1: 1
>>> namespace_mdts_scanning-phase2: 0
>>> namespace_mdts_completed: 0
>>> namespace_mdts_failed: 0
>>> namespace_mdts_stopped: 0
>>> namespace_mdts_paused: 0
>>> namespace_mdts_crashed: 0
>>> namespace_mdts_partial: 0
>>> namespace_mdts_co-failed: 0
>>> namespace_mdts_co-stopped: 0
>>> namespace_mdts_co-paused: 0
>>> namespace_mdts_unknown: 0
>>> namespace_osts_init: 0
>>> namespace_osts_scanning-phase1: 0
>>> namespace_osts_scanning-phase2: 0
>>> namespace_osts_completed: 0
>>> namespace_osts_failed: 0
>>> namespace_osts_stopped: 0
>>> namespace_osts_paused: 0
>>> namespace_osts_crashed: 0
>>> namespace_osts_partial: 0
>>> namespace_osts_co-failed: 0
>>> namespace_osts_co-stopped: 0
>>> namespace_osts_co-paused: 0
>>> namespace_osts_unknown: 0
>>> namespace_repaired: 68265278
>>> 
>>> with the layout_repaired and namespace_repaired values ticking up at
>>> about 10000 per second.
>>> 
>>> Is the layout_osts_failed value of 30 a concern?
>>> 
>>> Is there any way to know how far along it is?
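
In case it helps, per-target progress counters can be read straight from the
lfsck parameters while it runs, e.g. (parameter names assume the 2.x lfsck;
"terra" taken from your log messages):

  # On the MDS:
  lctl get_param mdd.terra-MDT0000.lfsck_layout
  lctl get_param mdd.terra-MDT0000.lfsck_namespace

  # On each OSS:
  lctl get_param obdfilter.terra-OST*.lfsck_layout

The checked_phase1/checked_phase2 and current_position fields there give a
rough sense of how far along the scan is.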
>>> 
>>> I am also seeing many messages similar to the following in
>>> /var/log/messages on the MDS and on the OSS serving OST0000:
>>> 
>>> Sep 25 10:48:00 mds0l210 kernel: LustreError: 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) terra-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -22
>>> Sep 25 10:48:00 mds0l210 kernel: LustreError: 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped 599 previous similar messages
>>> Sep 25 10:48:30 mds0l210 kernel: LustreError: 6137:0:(fld_handler.c:256:fld_server_lookup()) srv-terra-MDT0000: Cannot find sequence 0x8: rc = -2
>>> Sep 25 10:48:30 mds0l210 kernel: LustreError: 6137:0:(fld_handler.c:256:fld_server_lookup()) Skipped 16593 previous similar messages
>>> Sep 25 10:58:01 mds0l210 kernel: LustreError: 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) terra-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -22
>>> Sep 25 10:58:01 mds0l210 kernel: LustreError: 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped 599 previous similar messages
>>> Sep 25 10:58:57 mds0l210 kernel: LustreError: 6137:0:(fld_handler.c:256:fld_server_lookup()) srv-terra-MDT0000: Cannot find sequence 0x8: rc = -2
>>> Sep 25 10:58:57 mds0l210 kernel: LustreError: 6137:0:(fld_handler.c:256:fld_server_lookup()) Skipped 40309 previous similar messages
>>> 
>>> Do these indicate that the process is not working?
>>> 
>>> Regards,
>>> Rodger
>>> 
>>> 
>>> 
>>> On 23/09/2017 15:07, rodger wrote:
>>>> Dear All,
>>>> 
>>>> In the process of upgrading 1.8.x to 2.x I've messed up a number of
>>>> the index values for OSTs by running tunefs.lustre with the --index
>>>> value set. To compound matters, while trying to get the OSTs to mount,
>>>> I erased the last_rcvd files on the OSTs. I'm looking for a way to
>>>> confirm what the index should be for each device. Part of the reason
>>>> for my difficulty is that, over the life of the filesystem, some OSTs
>>>> were decommissioned, so the full set no longer has a sequential run
>>>> of index values. In practicing for the upgrade, the trial sets I
>>>> created did have nice neat sequential indexes, and the process I
>>>> developed broke when I used the real data. :-(
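
For what it's worth, the index currently stored on each device (the
"current index" axis of the grid above) can be read without mounting it,
e.g. (device name is a placeholder):

  tunefs.lustre --dryrun /dev/sdX 2>&1 | grep -E 'Target:|Index:|Flags:'

Comparing that against the objects actually present on each device is one
way to cross-check which index each OST should carry.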
>>>> 
>>>> The result is that although the Lustre filesystem mounts and all
>>>> directories appear to be listed, files in directories mostly have
>>>> question marks for attributes and are not available for access. I'm
>>>> assuming this is because the index for the OST holding each file is
>>>> wrong.
>>>> 
>>>> Any pointers to recovery would be much appreciated!
>>>> 
>>>> Regards,
>>>> Rodger
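
To round this off, the re-index/writeconf cycle Ben describes would look
roughly like the following (device names and the index value are
placeholders; do it with clients and all targets unmounted):

  # On the OST whose stored index is wrong, set the correct index:
  tunefs.lustre --index=5 --writeconf /dev/ost_dev

  # Then regenerate the config logs on every target so they all
  # re-register with the MGS: MGS/MDT first, then each OST:
  tunefs.lustre --writeconf /dev/mdt_dev
  tunefs.lustre --writeconf /dev/ost_dev    # repeat for every OST

  # Remount in order: MGS/MDT first, then the OSTs, then clients.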

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation
