[Lustre-discuss] Quota Info lost

Thomas Roth t.roth at gsi.de
Wed Dec 17 06:26:49 PST 2008


Hi all,

we somehow lost our quota info: checking a user's quotas gave me
 > user quotas are not enabled
This system, running Lustre 1.6.5.1, was set up at the end of October,
and I enabled quotas at that time. I also ran some tests then, which
showed that quotas were actually working. Since then, neither the
hardware nor the MGS-MDT-OST setup has changed, i.e. no new OSTs were
added or similar.
So my question is: what may cause the quota info to get lost?
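
To notice such a loss earlier, a periodic check along the following lines
might help. This is only a minimal sketch: it assumes that
"lfs quota -u <user> <mountpoint>" prints the message quoted above when
quotas are off, and the function names and the marker string matching are
illustrative assumptions, not anything taken from the Lustre tools.

```python
import subprocess

# Message quoted above; assumed to appear in the lfs output when quotas are off.
DISABLED_MARKER = "quotas are not enabled"


def quotas_disabled(output: str) -> bool:
    """Return True if captured lfs output reports that quotas are disabled."""
    return DISABLED_MARKER in output.lower()


def check_user_quota(user: str, mountpoint: str) -> bool:
    """Run `lfs quota -u USER MOUNTPOINT`; return True if quotas look enabled."""
    result = subprocess.run(
        ["lfs", "quota", "-u", user, mountpoint],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit status is inspected via the output text
    )
    return not quotas_disabled(result.stdout + result.stderr)
```

Run from cron on a client, check_user_quota() with a known user and the
mount point of the file system would at least narrow down the time at
which the quota info disappeared.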

Admittedly, I had some problems with quota for a particular user ID on
this system, which I reported to this list earlier. But these should not
lead to a complete loss three weeks later, should they?

In addition, I confirmed the Lustre manual's strong warning against
running "lfs setquota" on a running system. I did just that when I saw
that quotas were apparently not enabled.
However, I did not get as far as having 'inaccurate statistic
information' as indicated in the manual. Instead, it seems I caused a
delay and timeout on the connection between the active MDS and its
slave: we are running an HA+DRBD pair for MGS/MDS, with a dedicated 1Gbit
link for the DRBD mirroring. Not all pings got through on this link,
which caused a stonith and a takeover by the slave.

I promise not to do such bad things again, but I would still like to
know: was this just a coincidence, or can such an ill-timed "lfs
setquota" cause a temporary overload (or something similar) of the MGS
such that the poor DRBD gets confused? (It was the drbd-ping to the MGS
partition that was lost.)

Many thanks,
Thomas



