[lustre-discuss] Cannot remove file

Patrick Farrell paf at cray.com
Thu May 28 06:51:24 PDT 2015


No - you'll have to take the file system offline to use debugfs in write mode rather than read only.

You could also mount as ldiskfs at that point.

This is most likely corruption on the MDT, but it can be Lustre-level corruption, such as bad FID information, rather than damage to the actual file system structures.  I.e., at the ldiskfs layer the structures are all sound, but some of the higher-level info they contain is incorrect.  So fsck may not help you, but I'd suggest running it just in case to check for lower-level corruption.
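
As a rough sketch of that check, assuming the MDT has already been unmounted and that a Lustre-aware e2fsprogs is installed: the script below only prints the command rather than running it, and the device path is a made-up placeholder, not something from this thread.

```shell
#!/bin/sh
# Dry-run sketch: prints the command instead of executing it.
# MDT_DEV is a hypothetical device path; substitute the real MDT device.
MDT_DEV="${MDT_DEV:-/dev/mapper/mdt0}"

plan_fsck() {
    dev="$1"
    # -f forces a check even if the file system is marked clean;
    # -n opens it read-only and answers "no" to every repair prompt.
    echo "e2fsck -fn $dev"
}

plan_fsck "$MDT_DEV"
```

A read-only pass like this only reports ldiskfs-level damage; it will not see Lustre-level problems such as bad FID information.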

But, you'll probably have to delete the objects in question by hand on the MDT.  This will orphan any OST objects corresponding to those files.  (They're sort of already orphaned by the corruption, though.)
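
Roughly, the by-hand removal could look like the following. This is a sketch under assumptions, not a tested procedure: the device, mount points and entry path are hypothetical placeholders, and the script only prints the commands so nothing is touched. On an MDT mounted as ldiskfs the client-visible namespace sits under ROOT/.

```shell
#!/bin/sh
# Dry-run sketch: every step is printed, nothing is executed.
# All paths below are hypothetical placeholders.
MDT_DEV="${MDT_DEV:-/dev/mapper/mdt0}"          # MDT block device
LUSTRE_MNT="${LUSTRE_MNT:-/mnt/mdt}"            # where the MDT is normally mounted
LDISKFS_MNT="${LDISKFS_MNT:-/mnt/mdt-ldiskfs}"  # temporary ldiskfs mount point
BAD_ENTRY="${BAD_ENTRY:-path/to/dir/file}"      # entry, relative to the Lustre root

# Variant 1: mount the MDT as ldiskfs and remove the entry with ordinary tools.
plan_ldiskfs_removal() {
    echo "umount $LUSTRE_MNT"                       # take the MDT offline
    echo "mount -t ldiskfs $MDT_DEV $LDISKFS_MNT"   # mount the backing file system
    echo "ls -l $LDISKFS_MNT/ROOT/$BAD_ENTRY"       # inspect before deleting
    echo "rm $LDISKFS_MNT/ROOT/$BAD_ENTRY"          # remove the broken entry
    echo "umount $LDISKFS_MNT"
    echo "mount -t lustre $MDT_DEV $LUSTRE_MNT"     # bring the MDT back
}

# Variant 2: leave the device unmounted and use debugfs in write mode.
plan_debugfs_removal() {
    echo "umount $LUSTRE_MNT"
    echo "debugfs -w -R 'unlink /ROOT/$BAD_ENTRY' $MDT_DEV"
    echo "mount -t lustre $MDT_DEV $LUSTRE_MNT"
}

plan_ldiskfs_removal
plan_debugfs_removal
```

Either way, the OST objects belonging to the removed files stay behind as orphans, as noted above.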

Also, the fact that it is the entire name of the file which is ?s rather than just the permission bits and owner suggests it is not a UID/GID issue.
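
On the quoting point from earlier in the thread (quoted below), here is a small self-contained illustration in a scratch directory. It assumes, purely for demonstration, that the name really is a dash followed by literal question marks; the second file name is invented to show what an unquoted pattern could also match.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is affected.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One file literally named "-?????????" and one that the unquoted
# glob ./-????????? would match as well (dash plus nine characters).
touch -- '-?????????' '-abcdefghi'

# "./" stops the leading dash being parsed as an option; the single
# quotes stop the shell expanding the question marks as wildcards.
unlink './-?????????'

remaining=$(ls -A)
printf '%s\n' "$remaining"

cd / && rm -rf "$workdir"
```

Only the literally named file is removed here; the look-alike survives. On a genuinely corrupted entry the unlink may still fail with "Invalid argument", as Jon saw.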
________________________________________
From: lustre-discuss [lustre-discuss-bounces at lists.lustre.org] on behalf of Martin Hecht [hecht at hlrs.de]
Sent: Thursday, May 28, 2015 7:19 AM
To: Jon Tegner
Cc: Lustre discussion
Subject: Re: [lustre-discuss] Cannot remove file

Hi Jon,

it might be an option to use debugfs for manually "repairing" the
directory so that the file behaves "normally" again, but maybe someone
else can answer whether this is possible online, and if so, what the
procedure looks like exactly.

best regards,
Martin

On 05/28/2015 01:43 PM, Jon Tegner wrote:
> Thanks!
>
> Tried the first suggestions, but can't bring the system down at the
> moment, so I have to wait with the ldiskfs-level fix.
>
> Any other suggestions not involving stopping it?
>
> Best Regards,
>
> /jon
>
> On 05/28/2015 01:13 PM, Martin Hecht wrote:
>> hi,
>>
>> if the file name starts with a dash, you should prepend "./" to it when
>> calling unlink. I'm not sure if it works with these broken files, which
>> don't even have a proper name anymore.
>> Also, you probably have to escape the question marks so that they don't
>> act as wildcards in the shell (or better, enclose the name in single quotes):
>> unlink './-?????????'
>>
>> If this doesn't work, you could try to stop lustre and mount the MDT as
>> ldiskfs and remove the entries on that level.
>>
>> lfsck is supposed to fix this online, too, but it doesn't work in 2.5 if
>> I recall correctly.
>>
>> best regards,
>> Martin
>>
>> On 05/28/2015 12:46 PM, Jon Tegner wrote:
>>> Hi,
>>>
>>> I have a few files which are listed (ls -l) with:
>>>
>>> "-????????? ? ?      ?             ?            ?"
>>>
>>> I have tried to remove them, both with "rm" and with "unlink", but
>>> neither of these works (unlink: cannot unlink `file': Invalid
>>> argument). This really doesn't bother me, except for an error message
>>> when backing up - but it still would be nice to know how to fix this.
>>>
>>> We are using lustre-2.5.3 on CentOS-6.5 boxes.
>>>
>>> Thanks!
>>>
>>> /jon
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>