[lustre-discuss] possible to read orphan ost objects on live filesystem?

Chris Hunter chris.hunter at yale.edu
Thu Sep 10 17:54:16 PDT 2015


Hi,
We experienced file corruption on several OSTs. We worked through 
recovery using the e2fsck and ll_recover_lost_found_objs tools.
After these steps, e2fsck came out clean.

The file corruption did not impact the MDT, so the files were still 
referenced by the MDT. Accessing such a file on a Lustre client (e.g. ls -l) 
reported the error "Cannot allocate memory".

Following the OST recovery steps, we started removing the corrupt files with 
the "unlink" command on a Lustre client (the rm command would not remove them).

Now a dry-run e2fsck of the OST is reporting errors:
"deleted/unused inodes" in Pass 2 (checking directory structure),
"Unattached inodes" in Pass 4 (checking reference counts),
"free block count wrong" in Pass 5 (checking group summary information).

Are e2fsck errors expected after unlinking files?
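
To compare dry runs taken before and after the unlinks, the e2fsck output 
can be grouped by pass. A minimal sketch in Python, assuming the usual 
"Pass N: ..." header lines in e2fsck output. The function names are made up 
for illustration, and the device path passed to dry_run is whatever block 
device backs the OST:

```python
import re
import subprocess
from collections import defaultdict

# e2fsck announces each phase with a header such as
# "Pass 2: Checking directory structure" (or "Pass 1B: ...").
PASS_HEADER = re.compile(r"^Pass (\d+[A-Za-z]?): ")


def group_by_pass(output):
    """Group e2fsck output lines under the pass header that precedes them.

    Lines printed before the first header (version banner) are skipped.
    Anything printed after the last header, including the final summary
    line, ends up under that last pass.
    """
    groups = defaultdict(list)
    current = None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        match = PASS_HEADER.match(line)
        if match:
            current = "Pass " + match.group(1)
            groups[current]  # register the pass even if it reports nothing
        elif current is not None:
            groups[current].append(line)
    return dict(groups)


def dry_run(device):
    """Run a forced, read-only check (-f -n) and return its grouped output.

    Hypothetical helper: it needs root and an unmounted (or at least
    quiescent) device to give meaningful results.
    """
    result = subprocess.run(
        ["e2fsck", "-f", "-n", device],
        capture_output=True,
        text=True,
    )
    # A non-zero exit status is normal when errors are left uncorrected.
    return group_by_pass(result.stdout)
```

Comparing the per-pass lists from two runs shows which messages appeared 
only after the unlinks.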

thanks,
chris hunter
chris.hunter at yale.edu


On 09/03/2015 12:54 PM, Martin Hecht wrote:
> Hi Chris,
>
> On 09/02/2015 07:18 AM, Chris Hunter wrote:
>> Hi Andreas
>>
>> On 09/01/2015 07:22 PM, Dilger, Andreas wrote:
>>> On 2015/09/01, 7:59 AM, "lustre-discuss on behalf of Chris Hunter"
>>> <lustre-discuss-bounces at lists.lustre.org on behalf of
>>> chris.hunter at yale.edu> wrote:
>>>
>>>> Hi Andreas,
>>>> Thanks for your help.
>>>>
>>>> If you have a striped lustre file with "holes" (i.e. one chunk is gone
>>>> due to hardware failure, etc.), are the remaining file chunks considered
>>>> orphan objects?
>> So when a lustre striped file has a hole (e.g. a missing chunk due to
>> hardware failure), the remaining file chunks stay indefinitely on the
>> OSTs.
>> Is there a way to reclaim the space occupied by these pieces (after
>> recovery of any usable data, etc.)?
> these remaining chunks still belong to the file (i.e. you have the
> metadata entry on the MDT and you see the file when lustre is mounted).
> By removing the file you free up the space.
>
> In general there are two types of inconsistencies which may occur:
> Orphan objects are objects which are NOT assigned to an entry on the
> MDT, i.e. chunks which do not belong to any file. These can be either
> pre-allocated chunks or chunks left over after a corruption of the
> metadata on the MDT.
>
> The other type of corruption is a file where chunks are
> missing in between. This can happen when an OST gets corrupted. As long
> as the MDT is OK, you should be able to remove such a file. If in
> addition the MDT is also corrupted, you should first fix the MDT, and
> you might then only be able to unlink the file (which again might leave
> some orphan objects on the OSTs). lfsck should be able to remove them,
> depending on the lustre version you are running...
>
> Another point: once the corrupted OSTs have been repaired
> with e2fsck, you can mount them as ldiskfs, see if there are chunks
> in lost+found, and use the tool ll_recover_lost_found_objs to restore
> them to their original place. I believe these objects which e2fsck puts in
> lost+found are another kind of thing, usually not called "orphan
> objects". As I said, they usually can be easily recovered.
>
> Martin
>
>

