[Lustre-discuss] quota problems in lustre 1.8

Robert LeBlanc robert at leblancnet.us
Wed Jul 1 07:59:54 PDT 2009


How did you solve this? We will be implementing quotas on our system soon
and don't want to fall into the same trap.

Thanks,

Robert LeBlanc
Life Sciences & Undergraduate Education Computer Support
Brigham Young University


On Wed, Jul 1, 2009 at 5:53 AM, Giacinto Donvito <giacinto.donvito at ba.infn.it> wrote:

> Thank you Zhiyong,
>
> with this hint I was able to find a way to solve the problem.
>
> Cheers,
> Giacinto
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Giacinto Donvito    LIBI -- EGEE3 SA1 INFN - Bari ITALY
> ------------------------------------------------------------------
> giacinto.donvito at ba.infn.it                   | GTalk/GMail:
> donvito.giacinto at gmail.com
> tel. +39 080 5443244   Fax  +39 0805442470    | Skype: giacinto_it
> VOIP:  +41225481596           | MSN: donvito.giacinto at hotmail.it
> AIM/iChat: gdonvito1                          | Yahoo: eric1_it
> ------------------------------------------------------------------
> Life is something that everyone should try at least once.
>    Henry J. Tillman
>
>
>
>
>
> On 01 Jul 2009, at 12:24, Zhiyong Landen tian wrote:
>
>
>
>>> Hi All,
>>>
>>> I'm experiencing some problems in a test installation of Lustre 1.8.0.x.
>>>
>>> The installation is composed of one server hosting the MDS and two
>>> servers hosting the OSTs.
>>> One of the servers has 12x2.7TB devices and the other has 16x2.3TB
>>> devices.
>>>
>>> All the devices were configured with:
>>>
>>> "tunefs.lustre --ost --mgsnode=lustre01 at tcp0 --param ost.quota_type=ug
>>> --writeconf /dev/sdxx"
>>>
>>> On the admin node I issued "lfs quotacheck -ug /lustre" (I saw read
>>> operations occurring on both disk servers), which ended without errors.
>>>
>>> I was able to set up per-user quotas on the admin node, and they seem to be
>>> successfully registered when checking with "lfs quota -u donvito /lustre".
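>>>
>>> For reference, limits like the ones shown in the output further below can
>>> be set with something along these lines (a minimal sketch: the values just
>>> mirror that output, and the -b/-B/-i/-I flag form assumes the 1.8-style
>>> setquota syntax):
>>>
>>> lfs setquota -u donvito -b 12000000 -B 15000000 -i 100 -I 100 /lustre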
>>>
>>> The problem I see is that it is possible for a user to exceed the quota,
>>> because the two servers behave differently: one of the two denies writes
>>> while the other does not.
>>> I tried with both the Lustre RPM kernel and a patched vanilla (2.6.22)
>>> kernel, and the result is the same. It is not tied to a particular
>>> physical server, as each of them sometimes shows this behaviour (but only
>>> one of the servers at a time). I have tried both 1.8.0 and 1.8.0.1, and
>>> the same behaviour is observed.
>>>
>>> As you can see, the system is correctly accounting the used space, but the
>>> servers do not deny writes:
>>>
>>> [root@lustre01 ~]# lfs quota -u donvito /lustre
>>> Disk quotas for user donvito (uid 501):
>>>      Filesystem     kbytes     quota     limit   grace   files   quota   limit   grace
>>>         /lustre  124927244*  12000000  15000000              24     100     100
>>>
>>
>> (Hint) You can use "lfs quota -v ..." to see how much quota grant each
>> quota slave has.
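>>
>> For example (assuming the same user and mount point as above):
>>
>> lfs quota -v -u donvito /lustre
>>
>> The verbose output should also list the block and inode quota granted to
>> the MDT and to each OST, so you can see which slave has run out of grant.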
>>
>>
>>> The "messages" log on the MDS and both the OSS servers are clean and
>>> nothing strange is noticeable.
>>> Do you have any ideas of where I can look in order to understand the
>>> problem?
>>>
>> This is expected with the current lquota design of Lustre. For Lustre
>> quota there are two kinds of roles: the quota master (MDS) and the quota
>> slaves (OSTs). When you set a quota, the limit is recorded on the quota
>> master. When data is written on the OSTs, the OSTs acquire quota grants
>> from the quota master whenever the quota remaining on an OST isn't enough.
>> At the same time, the OSTs keep a kind of "quota grant cache", so that the
>> quota slaves don't have to ask the quota master for quota on every write
>> (if they did, performance would suffer). Each quota slave then judges
>> independently, based only on the grant it got from the MDS, whether the
>> request it received exceeds the quota. That is why you see what you saw.
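>>
>> As a rough illustration (the per-OST grant unit below is only an assumed
>> placeholder, not a value read from your system), the worst-case overshoot
>> grows with the number of quota slaves holding cached grant:
>>
>> # back-of-the-envelope sketch; GRANT_UNIT_KB is an assumption
>> NUM_OSTS=28                     # 12 OSTs on one OSS + 16 on the other
>> GRANT_UNIT_KB=$((128 * 1024))   # assumed per-OST block grant unit (128 MB)
>> echo "worst case: ~$((NUM_OSTS * GRANT_UNIT_KB)) KB written past the limit"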
>>
>>> Do you have any other suggestions for tests that I can do?
>>>
>> For performance reasons, Lustre's lquota is distributed and isn't exactly
>> like a local quota (e.g. ext3). So what you saw is normal for lquota;
>> currently, you can only adapt your application to this behaviour if it
>> hurts you.
>>
>>>
>>> Thank you very much.
>>>
>>> Best Regards,
>>> Giacinto
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Giacinto Donvito    LIBI -- EGEE2 SA1 INFN - Bari ITALY
>>> ------------------------------------------------------------------
>>> giacinto.donvito at ba.infn.it                   | GTalk/GMail: donvito.giacinto at gmail.com
>>> tel. +39 080 5443244   Fax  +39 0805442470    | VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
>>> Skype: giacinto_it | AIM/iChat: gdonvito1 | Yahoo: eric1_it
>>> ------------------------------------------------------------------
>>> "A simple design always takes less time to finish than a complex one.
>>> So always do the simplest thing that could possibly work."
>>> Don Wells at www.extremeprogramming.org
>>>
>>> "Writing about music is like dancing about architecture." - Frank Zappa <
>>> http://feedproxy.google.com/%7Er/randomquotes/%7E3/G2PjcLJ0ONI/>
>>> ------------------------------------------------------------------------
>>>
>>>
>>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>

