[lustre-discuss] Wrong --index set for OST

rodger rodger at csag.uct.ac.za
Tue Sep 26 08:59:07 PDT 2017


Hi Ben,

Many thanks for the advice! I am working on scripts to do this. Thanks 
also to Peter Braam for his help!

Instead of doing an ls -l, I have mounted all the OSTs with ldiskfs and 
built an index of the objects. I'm then looking for text-type files with 
the object ID obtained from the getstripe information, and checking the 
content of each file to verify that it corresponds to the object ID. From 
this I can tie the index value on the MDS to the OST that holds that 
object. At the end I will run tunefs.lustre on each OST with this index value.
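For reference, the getstripe-to-OST mapping step might be scripted roughly as below. This is a sketch only: the heredoc stands in for real "lfs getstripe" output (the exact layout varies between Lustre versions), and the object path convention at the end is the usual ldiskfs one but should be verified on your version.

```shell
#!/bin/sh
# Pull the (obdidx, objid) pair out of "lfs getstripe" output for a
# single-striped file.  Rows after the "obdidx ... objid" header line
# carry the stripe data; print the first two columns of each.
parse_stripe() {
  awk '/obdidx/ {hdr=1; next} hdr && NF >= 2 {print $1, $2}'
}

# Sample output standing in for:  lfs getstripe /lustre/some/file
parse_stripe <<'EOF'
/lustre/some/file
lmm_stripe_count:  1
lmm_stripe_size:   1048576
lmm_stripe_offset: 5
	obdidx		 objid		 objid		 group
	     5	       123456	     0x1e240	        0
EOF
# prints: 5 123456
# On the ldiskfs-mounted device that is really OST index 5, this object
# would normally live at O/0/d<objid % 32>/<objid>, i.e. O/0/d0/123456
# here (123456 % 32 = 0) -- worth double-checking against your layout.
```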

I did find files earlier where ls -l appeared to work but was actually 
pointing to the wrong file.

We have 42 OSTs so it is a bit tedious but not unmanageable!

Regards,
Rodger


On 26/09/2017 15:35, Ben Evans wrote:
> I'm guessing on the OSTs, but what you'd want to do is find files that
> are striped to a single OST using "lfs getstripe".  You'll need one file
> per OST.
> 
> After that, you'll have to do something like iterate through the OSTs to
> find the right combo where an ls -l works for that file.  Keep track of
> what OST indexes map to what devices, because you'll be destroying them
> pretty constantly until you resolve all of them.
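A rough dry-run sketch of that per-OST search (the mount point and index list are placeholders; "lfs find" offers --stripe-count and --ost filters on recent Lustre versions, but check yours). The commands are echoed rather than executed so the plan can be reviewed first:

```shell
#!/bin/sh
# Emit one "lfs find" command per OST index, each locating a candidate
# file striped to a single OST.  Pipe the output to sh to actually run.
MOUNT=/lustre                # placeholder client mount point
for idx in 0 1 2; do         # placeholder: substitute your real indexes
  echo "lfs find $MOUNT --stripe-count 1 --ost $idx -type f | head -n 1"
done
```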
> 
> Each time you change an OST index, you'll need to do tunefs.lustre
> --writeconf on *all* devices to make them register with the MGS again.
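The "writeconf everything after each change" step could be kept manageable with a small dry-run loop like the following (device paths are placeholders; drop the echo once the list is confirmed):

```shell
#!/bin/sh
# After any index change, every target must be re-registered with the
# MGS via --writeconf.  Echoed here as a dry run for review.
for dev in /dev/mapper/ost00 /dev/mapper/ost01; do   # placeholder list
  echo "tunefs.lustre --writeconf $dev"
done
```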
> 
> -Ben Evans
> 
> On 9/26/17, 1:08 AM, "lustre-discuss on behalf of rodger"
> <lustre-discuss-bounces at lists.lustre.org on behalf of
> rodger at csag.uct.ac.za> wrote:
> 
>> Dear All,
>>
>> Apologies for nagging on this!
>>
>> Does anyone have any insight on assessing progress of the lfsck?
>>
>> Does anyone have experience of fixing incorrect index values on OST?
>>
>> Regards,
>> Rodger
>>
>> On 25/09/2017 11:21, rodger wrote:
>>> Dear All,
>>>
>>> I'm still struggling with this. I am running an lfsck -A at present.
>>> The
>>> status update is reporting:
>>>
>>> layout_mdts_init: 0
>>> layout_mdts_scanning-phase1: 1
>>> layout_mdts_scanning-phase2: 0
>>> layout_mdts_completed: 0
>>> layout_mdts_failed: 0
>>> layout_mdts_stopped: 0
>>> layout_mdts_paused: 0
>>> layout_mdts_crashed: 0
>>> layout_mdts_partial: 0
>>> layout_mdts_co-failed: 0
>>> layout_mdts_co-stopped: 0
>>> layout_mdts_co-paused: 0
>>> layout_mdts_unknown: 0
>>> layout_osts_init: 0
>>> layout_osts_scanning-phase1: 0
>>> layout_osts_scanning-phase2: 12
>>> layout_osts_completed: 0
>>> layout_osts_failed: 30
>>> layout_osts_stopped: 0
>>> layout_osts_paused: 0
>>> layout_osts_crashed: 0
>>> layout_osts_partial: 0
>>> layout_osts_co-failed: 0
>>> layout_osts_co-stopped: 0
>>> layout_osts_co-paused: 0
>>> layout_osts_unknown: 0
>>> layout_repaired: 82358851
>>> namespace_mdts_init: 0
>>> namespace_mdts_scanning-phase1: 1
>>> namespace_mdts_scanning-phase2: 0
>>> namespace_mdts_completed: 0
>>> namespace_mdts_failed: 0
>>> namespace_mdts_stopped: 0
>>> namespace_mdts_paused: 0
>>> namespace_mdts_crashed: 0
>>> namespace_mdts_partial: 0
>>> namespace_mdts_co-failed: 0
>>> namespace_mdts_co-stopped: 0
>>> namespace_mdts_co-paused: 0
>>> namespace_mdts_unknown: 0
>>> namespace_osts_init: 0
>>> namespace_osts_scanning-phase1: 0
>>> namespace_osts_scanning-phase2: 0
>>> namespace_osts_completed: 0
>>> namespace_osts_failed: 0
>>> namespace_osts_stopped: 0
>>> namespace_osts_paused: 0
>>> namespace_osts_crashed: 0
>>> namespace_osts_partial: 0
>>> namespace_osts_co-failed: 0
>>> namespace_osts_co-stopped: 0
>>> namespace_osts_co-paused: 0
>>> namespace_osts_unknown: 0
>>> namespace_repaired: 68265278
>>>
>>> with the layout_repaired and namespace_repaired values ticking up at
>>> about 10000 per second.
>>>
>>> Is the layout_osts_failed value of 30 a concern?
>>>
>>> Is there any way to know how far along it is?
>>>
>>> I am also seeing many messages similar to the following in
>>> /var/log/messages on the MDT and on the OSS with OST0000:
>>>
>>> Sep 25 10:48:00 mds0l210 kernel: LustreError:
>>> 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans())
>>> terra-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -22
>>> Sep 25 10:48:00 mds0l210 kernel: LustreError:
>>> 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped
>>> 599 previous similar messages
>>> Sep 25 10:48:30 mds0l210 kernel: LustreError:
>>> 6137:0:(fld_handler.c:256:fld_server_lookup()) srv-terra-MDT0000:
>>> Cannot find sequence 0x8: rc = -2
>>> Sep 25 10:48:30 mds0l210 kernel: LustreError:
>>> 6137:0:(fld_handler.c:256:fld_server_lookup()) Skipped 16593 previous
>>> similar messages
>>> Sep 25 10:58:01 mds0l210 kernel: LustreError:
>>> 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans())
>>> terra-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -22
>>> Sep 25 10:58:01 mds0l210 kernel: LustreError:
>>> 5934:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped
>>> 599 previous similar messages
>>> Sep 25 10:58:57 mds0l210 kernel: LustreError:
>>> 6137:0:(fld_handler.c:256:fld_server_lookup()) srv-terra-MDT0000:
>>> Cannot find sequence 0x8: rc = -2
>>> Sep 25 10:58:57 mds0l210 kernel: LustreError:
>>> 6137:0:(fld_handler.c:256:fld_server_lookup()) Skipped 40309 previous
>>> similar messages
>>>
>>> Do these indicate that the process is not working?
>>>
>>> Regards,
>>> Rodger
>>>
>>>
>>>
>>> On 23/09/2017 15:07, rodger wrote:
>>>> Dear All,
>>>>
>>>> In the process of upgrading 1.8.x to 2.x I've messed up a number of
>>>> the index values for OSTs by running tunefs.lustre with the --index
>>>> value set. To compound matters, while trying to get the OSTs to
>>>> mount I erased the last_rcvd files on the OSTs. I'm looking for a
>>>> way to confirm what the index should be for each device. Part of the
>>>> reason for my difficulty is that in the evolution of the filesystem
>>>> some OSTs were decommissioned, so the full set no longer has a
>>>> sequential set of index values. In practicing for the upgrade the
>>>> trial sets that I created did have nice neat sequential indexes, and
>>>> the process I developed broke when I used the real data. :-(
>>>>
>>>> The result is that although the Lustre filesystem mounts and all
>>>> directories appear to be listed, files in directories mostly have
>>>> question marks for their attributes and are not available for
>>>> access. I'm assuming this is because the index for the OST holding
>>>> the file is wrong.
>>>>
>>>> Any pointers to recovery would be much appreciated!
>>>>
>>>> Regards,
>>>> Rodger
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 

