[Lustre-discuss] ost's reporting full

Robin Humble robin.humble+lustre at anu.edu.au
Sat Sep 11 02:27:32 PDT 2010


Hey Dr Stu,

On Sat, Sep 11, 2010 at 04:27:43PM +0800, Stuart Midgley wrote:
>We are getting jobs that fail due to no space left on device.
>BUT none of our lustre servers are full (as reported by lfs df -h on a client and by df -h on the oss's).
>They are all close to being full, but are not actually full (still have ~300gb of space left)

sounds like a grant problem.

>I've tried playing around with tune2fs -m {0,1,2,3} and tune2fs -r 1024 etc and nothing appears to help.
>Anyone have a similar problem?  We are running 1.8.3

there are a couple of grant leaks that are fixed in 1.8.4, e.g.
  https://bugzilla.lustre.org/show_bug.cgi?id=22755
or see the 1.8.4 release notes.

however the overall grant revoking problem is still unresolved AFAICT
  https://bugzilla.lustre.org/show_bug.cgi?id=12069
and you'll hit that issue more frequently with many clients and small
OSTs, or when any OST starts getting full.

in your case 300GB free per OST should be enough headroom unless you now
have ~4k or more clients (assuming 32-64MB of grant per client per OST,
64MB x 4.8k clients is already 300GB), so it's probably grant leaks.
bz 22755 has a recipe for adding up the client grants and comparing them
to the server grants to see if they've gone wrong.
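
to make the arithmetic and the comparison concrete, here is a rough python
sketch. it is not the recipe from bz 22755, just an illustration of the same
idea. it assumes you have collected the output of
`lctl get_param osc.*.cur_grant_bytes` from every client and
`lctl get_param obdfilter.*.tot_granted` from the OSSs, and that the lines
look like `name=value` with the OST name (eg. lustre-OST0000) embedded in the
parameter name. the parameter names, the output format and all the function
names below are assumptions, so check them against your own version first.

```python
import re
from collections import defaultdict

MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumed naming: OST names look like "<fsname>-OST<4 hex digits>".
OST_NAME = re.compile(r"([A-Za-z0-9_]+-OST[0-9a-fA-F]{4})")


def clients_to_exhaust(free_bytes, grant_per_client_bytes):
    """How many clients' grants it takes to cover all free space on one OST."""
    return free_bytes // grant_per_client_bytes


def grants_by_ost(lctl_output):
    """Sum 'name=value' lines per OST name found in the parameter name."""
    totals = defaultdict(int)
    for line in lctl_output.splitlines():
        name, sep, value = line.strip().partition("=")
        ost = OST_NAME.search(name)
        if sep and ost and value.strip().isdigit():
            totals[ost.group(1)] += int(value)
    return dict(totals)


def grant_mismatch(client_outputs, server_output):
    """Per OST: bytes the server thinks it has granted minus what clients hold."""
    client_totals = defaultdict(int)
    for output in client_outputs:
        for ost, granted in grants_by_ost(output).items():
            client_totals[ost] += granted
    server_totals = grants_by_ost(server_output)
    return {ost: server_totals[ost] - client_totals.get(ost, 0)
            for ost in server_totals}


if __name__ == "__main__":
    # 300GB free with 64MB or 32MB of grant per client
    print(clients_to_exhaust(300 * GIB, 64 * MIB))
    print(clients_to_exhaust(300 * GIB, 32 * MIB))
```

a large positive number from grant_mismatch() for an OST would mean the
server believes it has handed out more grant than the clients report holding,
which is what you'd expect to see if grant is leaking. the numbers will never
match exactly on a busy filesystem, so only a big and growing gap is
interesting.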

cheers,
robin


