[Lustre-discuss] Limits on OSTs per OSS?
Ms. Megan Larko
dobsonunit at gmail.com
Wed Aug 19 07:44:35 PDT 2009
Greetings Sebastien,
On Wed, Aug 19, 2009 at 10:39 AM, Sébastien
Buisson <sebastien.buisson at bull.net> wrote:
> Hi,
>
> By my math:
> 12 OSTs x 1.2 GB = 14.4 GB < 16 GB
>
> So you are clearly within the recommendation.
I thought I would be within the spec *if* my OSTs were smaller units.
As they are JBODs in sections of 6+ TB each, I thought I was
"coloring outside the lines".
Thanks,
megan
>
> Cheers,
> Sebastien.
>
>
> Ms. Megan Larko wrote:
>>
>> Responding to what Sebastien has written:
>>>
>>> Hi,
>>
>>> Just a bit of feedback from our own experience.
>>> I agree with Brian that there is no hard limit on the number of OSTs
>>> per OSS in the Lustre code. But one should really take into account
>>> the available memory on OSSes when defining the number of OSTs per
>>> OSS (and so the size of each OST). If you do not have 1 GB or 1.2 GB
>>> of memory per OST on your OSSes, you will run into serious trouble
>>> with "out of memory" messages.
>>
>>> For instance, if you want 8 OSTs per OSS, your OSSes should have at
>>> least 10 GB of RAM (8 x 1.2 GB = 9.6 GB, plus some headroom).
>>
>>> Unfortunately we experienced those "out of memory" problems, so I
>>> advise you to read the Lustre Operations Manual, chapter 33.12, "OSS
>>> RAM Size for a Single OST".
>>
>>> Cheers,
>>> Sebastien.
>>
>> We have one OSS running kernel 2.6.18-53.1.13.el5_lustre.1.6.4.3smp
>> (Lustre 1.6.4.3). This OSS has 16 GB of RAM for 76 TB of formatted
>> Lustre disk space.
>>
>> [root@oss4 ~]# cat /proc/meminfo
>> MemTotal:     16439360 kB
>> MemFree:         88204 kB
>>
>> Client sees: ic-mds1@o2ib:/crew8, total usable space 76 TB
>>
>> The OSS has 6 JBODs, each of which is partitioned into two parts to
>> stay below the Lustre 8 TB per-partition limit (a sketch of those
>> steps follows the listing).
>> /dev/sdb1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0000
>> /dev/sdb2 6.3T 3.7T 2.3T 62% /srv/lustre/OST/crew8-OST0001
>> /dev/sdc1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0002
>> /dev/sdc2 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0003
>> /dev/sdd1 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0004
>> /dev/sdd2 6.3T 4.2T 1.8T 70% /srv/lustre/OST/crew8-OST0005
>> /dev/sdi1 6.3T 4.3T 1.8T 71% /srv/lustre/OST/crew8-OST0006
>> /dev/sdi2 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0007
>> /dev/sdj1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0008
>> /dev/sdj2 6.3T 3.8T 2.2T 63% /srv/lustre/OST/crew8-OST0009
>> /dev/sdk1 6.3T 3.7T 2.3T 62% /srv/lustre/OST/crew8-OST0010
>> /dev/sdk2 6.3T 3.7T 2.3T 63% /srv/lustre/OST/crew8-OST0011
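>>
>> For reference, partitioning and formatting one of these JBODs looked
>> roughly like the sketch below (illustrative only, from memory; the
>> fsname and mgsnode come from the client mount line above, and the
>> exact parted geometry is an assumption):
>>
>>   # split the JBOD into two halves, each under the 8 TB limit
>>   parted /dev/sdb mklabel gpt
>>   parted /dev/sdb mkpart primary 0% 50%
>>   parted /dev/sdb mkpart primary 50% 100%
>>   # format each half as an OST of the crew8 filesystem and mount it
>>   mkfs.lustre --ost --fsname=crew8 --mgsnode=ic-mds1@o2ib /dev/sdb1
>>   mkfs.lustre --ost --fsname=crew8 --mgsnode=ic-mds1@o2ib /dev/sdb2
>>   mount -t lustre /dev/sdb1 /srv/lustre/OST/crew8-OST0000
>>   mount -t lustre /dev/sdb2 /srv/lustre/OST/crew8-OST0001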
>>
>> As you can see, this is nowhere near the recommendation of 1 GB of
>> RAM per OST. Yes, under load we do occasionally see kernel panics
>> due to, we believe, insufficient memory and swap. These panics occur
>> approximately once per month. We also see watchdog messages reporting
>> "swap page allocation failure", sometimes a day prior to a kernel
>> panic. Only after this Lustre disk was up and running was I
>> enlightened that this was too much load for a single OSS. Ah well,
>> live and learn. I am planning to split this one large group across
>> two OSSes in the next month. Hopefully the kernel panics and
>> watchdog errors will go away with the OST load shared across two
>> OSS machines.
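>>
>> Until that split happens, a crude early-warning check run from cron
>> every few minutes (a sketch; the ~500 MB threshold is an arbitrary
>> choice on my part, not a tuned value):
>>
>>   #!/bin/sh
>>   # log a syslog warning when free memory drops below ~500 MB
>>   free_kb=$(awk '/MemFree/ {print $2}' /proc/meminfo)
>>   if [ "$free_kb" -lt 512000 ]; then
>>       logger -p daemon.warning "OSS low memory: ${free_kb} kB free"
>>   fi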
>>
>> Just one real life scenario for your consideration.
>>
>> megan