[Lustre-discuss] mds reports that OST is full (ENOSPC error -28) but df tells different

Brian J. Murrell Brian.Murrell at Sun.COM
Thu Jan 22 08:02:47 PST 2009


On Thu, 2009-01-22 at 15:44 +0000, Wojciech Turek wrote:
> Hello,

Hi,

> The Lustre MDS reports the following error:
> Jan 22 15:20:40 mds01.beowulf.cluster kernel: LustreError:
> 24680:0:(lov_request.c:692:lov_update_create_set()) error creating fid
> 0xeb79c9d sub-object on OST idx 4/1: rc = -28

-28 is ENOSPC.
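
If you ever need to translate an errno number yourself, a quick
one-liner (assuming Python is installed on the node) is:

  $ python -c 'import os; print(os.strerror(28))'
  No space left on device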
 
> Which I translate as: one of the OSTs (index 4/1) is full and has
> no space left on the device.

Yes.

> The OSS seems to be consistent and says:
> Jan 22 15:21:15 storage08.beowulf.cluster kernel: LustreError:
> 23507:0:(filter_io_26.c:721:filter_commitrw_write()) error starting
> transaction: rc = -30

Hrm.  I'm not sure a -30 (EROFS) would translate to a -28 at the MDS.
I think it would also be a -30.  So are you sure you are looking at
correlating messages?  The timestamps, if the two nodes' clocks are in
sync, also seem to indicate a lack of correlation, with 35s of
disparity.

Perhaps there is an actual -28 in the OSS log prior to the -30 one?
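
If the clocks are close to in sync, grepping the OSS syslog for the
earlier error should turn it up; something like (assuming syslog goes
to /var/log/messages on your OSS):

  $ grep 'rc = -28' /var/log/messages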

> Which I translate as: a client would like to write to an existing
> file but can't because the filesystem is read-only.

Indeed.  But why is it read-only?  There should be an event in the OSS
log saying that it was turning the filesystem read-only.
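
A grep over the OSS logs for the remount event should find it, e.g.
(log path again an assumption):

  $ grep -iE 'read-only|remount' /var/log/messages
  $ dmesg | grep -i 'read-only'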

> The OST device is still mounted with the rw option

Yeah.  That's just the state at mount time.  Lustre will set a device
read-only in the case of filesystem errors, as one example.
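
You can check what the backing ldiskfs filesystem is set to do on
errors with tune2fs (/dev/sdX below stands in for the actual OST
device):

  $ tune2fs -l /dev/sdX | grep -i errors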


> Now the main question is: why does Lustre think that OST (idx 4) is full?

No, I think the main question is why it is read-only.  The full
situation may have been transient: it may have filled up momentarily
and then some objects were removed.  In any case, this is a secondary
issue and really only needs to be considered once the read-only
situation is understood.
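
To see the current space and inode usage per OST from a client, which
will show whether idx 4 really is (still) full:

  $ lfs df -h
  $ lfs df -i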

> Is it possible that this OST has many orphaned objects which take
> up all the available space?

That would be reflected in the df.  If you suspect there may be
orphan objects though, you could run lfsck to verify and clean up.

> Is there a way of reclaiming this space?

If you mean orphaned OST objects, then lfsck.
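
Roughly, the lfsck procedure (check the Lustre manual for the details
for your version; the device paths below are placeholders) is to build
databases from the MDS and OST devices with e2fsck and then run lfsck
from a client:

  # on the MDS, with the MDT device not in use:
  $ e2fsck -n -v --mdsdb /tmp/mdsdb /dev/mdsdev
  # on each OSS, for each OST device:
  $ e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /dev/ostdev
  # on a client, with the databases copied over (-n = dry run):
  $ lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /mnt/lustre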

b.
