[Lustre-discuss] Limits on OSTs per OSS?
Sébastien Buisson
sebastien.buisson at bull.net
Wed Aug 19 07:39:40 PDT 2009
Hi,
To me:
12 OSTs x 1.2 GB = 14.4 GB < 16 GB
So you are clearly within the recommendation.
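The arithmetic above can be sketched as a quick sanity check (an illustrative helper, not Lustre tooling; the ~1.2 GB-per-OST figure is the guideline discussed in this thread):

```python
# Check an OSS's RAM against the ~1.2 GB-per-OST guideline
# discussed in this thread (figures are from the thread itself).

GB_PER_OST = 1.2  # recommended RAM per OST

def ram_ok(num_osts, oss_ram_gb, gb_per_ost=GB_PER_OST):
    """Return (required_gb, within_recommendation)."""
    required = num_osts * gb_per_ost
    return required, oss_ram_gb >= required

# 12 OSTs on a 16 GB OSS: requires ~14.4 GB, so 16 GB is within
# the recommendation.
required, ok = ram_ok(12, 16)
```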
Cheers,
Sebastien.
Ms. Megan Larko wrote:
> Responding to what Sebastien has written:
>> Hi,
>
>> Just a small feedback from our own experience.
>> I agree with Brian about the fact that there is no strong limit on the
>> number of OSTs per OSS in the Lustre code. But one should really take
>> into account the available memory on OSSes when defining the number of
>> OSTs per OSS (and so the size of each OST). If you do not have 1GB or
>> 1.2 GB of memory per OST on your OSSes, you will run into serious
>> trouble with "out of memory" messages.
>
>> For instance, if you want 8 OSTs per OSS, your OSSes should have at
>> least 10GB of RAM.
>
>> Unfortunately we experienced those "out of memory" problems, so I advise
>> you to read Lustre Operations Manual chapter 33.12 "OSS RAM Size for a
>> Single OST".
>
>> Cheers,
>> Sebastien.
>
> We have one OSS running Lustre 2.6.18-53.1.13.el5_lustre.1.6.4.3smp.
> This OSS has 16 GB of RAM for 76 TB of formatted Lustre disk space.
>
> [root at oss4 ~]# cat /proc/meminfo
> MemTotal: 16439360 kB
> MemFree: 88204 kB
>
> Client sees: ic-mds1 at o2ib:/crew8 Total Usable Space 76 TB
>
> The OSS has 6 JBODs, each of which is partitioned in two parts to stay
> below the Lustre 8 TB per-partition limit.
> /dev/sdb1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0000
> /dev/sdb2 6.3T 3.7T 2.3T 62% /srv/lustre/OST/crew8-OST0001
> /dev/sdc1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0002
> /dev/sdc2 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0003
> /dev/sdd1 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0004
> /dev/sdd2 6.3T 4.2T 1.8T 70% /srv/lustre/OST/crew8-OST0005
> /dev/sdi1 6.3T 4.3T 1.8T 71% /srv/lustre/OST/crew8-OST0006
> /dev/sdi2 6.3T 3.8T 2.2T 64% /srv/lustre/OST/crew8-OST0007
> /dev/sdj1 6.3T 3.8T 2.3T 63% /srv/lustre/OST/crew8-OST0008
> /dev/sdj2 6.3T 3.8T 2.2T 63% /srv/lustre/OST/crew8-OST0009
> /dev/sdk1 6.3T 3.7T 2.3T 62% /srv/lustre/OST/crew8-OST0010
> /dev/sdk2 6.3T 3.7T 2.3T 63% /srv/lustre/OST/crew8-OST0011
>
> As you can see, this is nowhere near the recommendation of 1 GB of RAM
> per OST. Yes, under load we do occasionally see kernel panics due to
> what we believe is insufficient memory and swap. These panics occur
> approximately once per month. We also see watchdog messages stating
> "swap page allocation failure", sometimes a day prior to a kernel
> panic. Only after this Lustre disk was up and running was I
> enlightened that this was too much load for a single OSS. Ah well,
> live and learn. I am planning to split this one large group across
> two OSSes in the next month. Hopefully the kernel panics and
> watchdog errors will go away with the OST load shared across two
> OSS machines.
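Megan's numbers can be checked the same way: parsing the MemTotal line she quotes from /proc/meminfo (a hypothetical helper for illustration, not Lustre tooling) shows the OSS has roughly 15.7 GB usable, only just above the 12 x 1.2 GB = 14.4 GB guideline, which fits with the OOM trouble she describes under load:

```python
def meminfo_total_gb(meminfo_text):
    """Parse MemTotal (reported in kB) from /proc/meminfo text
    and convert it to GB (using 1 GB = 1024 * 1024 kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kb = int(line.split()[1])
            return kb / (1024 * 1024)
    raise ValueError("MemTotal not found")

# The figures quoted in the message above:
sample = "MemTotal: 16439360 kB\nMemFree: 88204 kB"
total_gb = meminfo_total_gb(sample)
# ~15.7 GB total, against a 14.4 GB guideline for 12 OSTs:
# technically within the recommendation, but with little headroom.
```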
>
> Just one real life scenario for your consideration.
>
> megan
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>