[Lustre-discuss] What's the human translation for: ost_write operation failed with -28
johann at whamcloud.com
Wed Dec 7 01:09:36 PST 2011
On Wed, Dec 07, 2011 at 02:05:28PM +1000, Thomas Guthmann wrote:
> >>> FYI, I have the following values on the OSS it couldn't connect/write to :
> >>> obdfilter.foobar-OST0003.tot_granted=17429659648
> >>> obdfilter.foobar-OST0004.tot_granted=13648875520
> >>> obdfilter.foobar-OST0005.tot_granted=18136141824
> > By default, one single OSC should not own more than 32MB of grant space. With 18GB of total granted space, you should have ~560 clients. How many clients are mounting the filesystem?
> Don't fret... 5 clients :)
Then there is a grant leak. Could you please run "lctl get_param osc.*.cur_grant_bytes" on all the clients? BTW, do you run the same version of Lustre (I think you mentioned 1.8.5) on all the nodes?
In any case, you can try to unmount/remount the OSTs to work around the problem.
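For reference, the arithmetic behind the client estimate can be sketched as below. This is only a rough check, assuming the 32MB default means 32 MiB per OSC and that each client holds one OSC per OST; the function name `implied_clients` is made up for illustration. The input lines are the tot_granted values quoted above.

```python
# Rough sanity check: how many clients would be needed to account for
# each OST's tot_granted if no OSC held more than the default grant?
DEFAULT_MAX_GRANT = 32 * 1024 * 1024  # assumption: "32MB" means 32 MiB per OSC


def implied_clients(lines, max_grant=DEFAULT_MAX_GRANT):
    """Map each 'name=value' parameter line to value // max_grant."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        result[name] = int(value) // max_grant
    return result


# Values reported for the affected OSS earlier in this thread.
SAMPLE = [
    "obdfilter.foobar-OST0003.tot_granted=17429659648",
    "obdfilter.foobar-OST0004.tot_granted=13648875520",
    "obdfilter.foobar-OST0005.tot_granted=18136141824",
]

estimates = implied_clients(SAMPLE)
for name, count in sorted(estimates.items()):
    print(f"{name}: at least {count} clients at the default grant")
```

Each OST comes out at several hundred implied clients, so with only 5 clients mounted the granted space on the server cannot be explained by legitimate grants.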
> >>> But, again, my application was writing into sparse files so the space was
> >>> already allocated... and the sparse files haven't grown.
> > Lustre (like most filesystems) does not allocate blocks for "holes" in sparse files.
> Hmm, what do you mean ?
My point is just that writing to a hole in a sparse file is not any different than writing at the end of a file and increasing its size. In both cases we have to allocate blocks and the write can fail with ENOSPC.
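To illustrate on a local POSIX filesystem (not Lustre itself), here is a small sketch that creates a sparse file, writes into the hole, and compares the apparent size with the allocated block count before and after. The helper name `write_into_hole` and the sizes are arbitrary choices for the example, and the exact block counts depend on the filesystem.

```python
import os
import tempfile


def write_into_hole(hole_size=64 * 1024 * 1024, chunk=1024 * 1024):
    """Create a sparse file, write into its hole, report size and blocks."""
    fd, path = tempfile.mkstemp()
    try:
        # Setting the length creates a hole: the apparent size grows,
        # but most filesystems allocate no data blocks for it.
        os.ftruncate(fd, hole_size)
        before = os.fstat(fd)

        # Writing inside the hole does not change the file size, yet the
        # filesystem has to allocate blocks, and this is the step that
        # can fail with ENOSPC (errno 28) on a full device.
        os.pwrite(fd, b"x" * chunk, hole_size // 2)
        os.fsync(fd)
        after = os.fstat(fd)
    finally:
        os.close(fd)
        os.unlink(path)

    return {
        "size_before": before.st_size,
        "size_after": after.st_size,
        "blocks_before": before.st_blocks,  # st_blocks is in 512-byte units
        "blocks_after": after.st_blocks,
    }


report = write_into_hole()
print(report)
```

On a filesystem that supports sparse files, size_before and size_after are equal while blocks_after is larger than blocks_before, which is why a write to an already "sized" sparse file can still run out of space.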