[Lustre-discuss] Lustre Storage Sizing- How?

Deval kulshrestha deval.kulshrestha at progression.com
Fri Jan 8 22:16:51 PST 2010


Dear Atul

Thanks for your valuable inputs; they also helped me understand the
fundamentals of Lustre storage sizing.
I really appreciate your help.

Thanks and Regards
Deval


-----Original Message-----
From: Atul.Vidwansa at Sun.COM [mailto:Atul.Vidwansa at Sun.COM] 
Sent: Friday, January 08, 2010 3:09 PM
To: Deval kulshrestha
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre Storage Sizing- How?

Hi Deval,

Lustre storage sizing is largely driven by:
* Capacity required
* Performance required
* Type of workload

Lustre 1.8.1.1 has a limit of 8 TB for an individual OST. Let's say you 
are using SATA disks for the OSTs. A Seagate enterprise 1 TB SATA disk 
can do around 90 MB/Sec with a 1 MB block size using dd (it can go up to 
110 MB/Sec if the block size is really large). Assuming that you want 
RAID6 protection for each OST, you need 10 SATA disks (8 data + 2 
parity) to form an 8 TB LUN.

You will need 4 such OSTs to give you 32 TB unformatted space.
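
As a rough sketch of the capacity arithmetic above (the function names are made up for illustration; the only inputs are the 1 TB disks, the 8+2 RAID6 layout and the 30 TB target from the original question):

```python
import math

def raid6_lun_tb(disks, disk_tb, parity_disks=2):
    """Unformatted capacity of one RAID6 LUN: only the data disks count."""
    return (disks - parity_disks) * disk_tb

def osts_needed(target_tb, lun_tb):
    """Smallest whole number of OSTs that covers the target capacity."""
    return math.ceil(target_tb / lun_tb)

lun_tb = raid6_lun_tb(disks=10, disk_tb=1.0)       # 8+2 RAID6 of 1 TB disks
n_osts = osts_needed(target_tb=30, lun_tb=lun_tb)  # 30 TB was the request
total_tb = n_osts * lun_tb
total_disks = n_osts * 10

print(lun_tb, n_osts, total_tb, total_disks)
```

This ignores filesystem formatting overhead, so the usable space will come out somewhat below the unformatted figure.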

Let's consider performance:

Ideally, you should get 720 MB/Sec/OST [90 MB/Sec/disk X 8 data disks 
in an (8+2) RAID6 set]. But you have to allow for the overhead of 
software/hardware RAID and the limits of the SAS PCIe HBA (or FC 
hardware RAID HBA). A 4 Gbps FC HBA tops out at 500 MB/Sec, so you need 
5-6 FC HBAs to utilize the storage bandwidth of 4 RAID6 OSTs [Total 
bandwidth = 4 X 720 MB/Sec/OST = 2880 MB/Sec, or roughly 2.8 GB/Sec].
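
The same bandwidth estimate as a small sketch. The constants are the figures quoted above; the helper name is hypothetical, and RAID overhead is deliberately left out, so treat the result as an upper bound:

```python
import math

DISK_MB_S = 90       # streaming rate per SATA disk, from the text
DATA_DISKS = 8       # data disks in one 8+2 RAID6 set
N_OSTS = 4
ADAPTER_MB_S = 500   # per-adapter ceiling, from the text

def adapters_needed(bandwidth_mb_s, adapter_mb_s=ADAPTER_MB_S):
    """Whole adapters required to carry the given bandwidth."""
    return math.ceil(bandwidth_mb_s / adapter_mb_s)

ost_mb_s = DISK_MB_S * DATA_DISKS   # ideal rate of one OST
total_mb_s = ost_mb_s * N_OSTS      # ideal rate of all OSTs together

print(ost_mb_s, total_mb_s, adapters_needed(total_mb_s))
```

Rounding up gives 6 adapters for the full 2880 MB/Sec, which is the upper end of the 5-6 range.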

So, now you have a storage system that delivers 32 TB unformatted space 
and 2.8 GB/Sec of performance for large sequential read/write workloads. 
If you are planning to have a mixed or small-I/O workload and still want 
to achieve 2 GB/Sec throughput, you have to double the specs. Small, 
random I/O (think of home directories) kills storage performance.

Let's size the MDS now.

There is no direct relation between the size of the OSTs and that of 
the MDT. The MDT is sized purely by the number of files required. It is 
a good idea to use FC or SAS disks for the MDS, as they spin at a higher 
rate and have better IOPS performance. For example, let's consider 
Seagate enterprise 15K rpm 300 GB SAS disks. You can put 4 such SAS 
disks in a RAID10 configuration for the MDT, which will give you 600 GB 
of unformatted space.

Lustre needs 4 KB of metadata for each file created, so you can store 
about 150 million files in a 600 GB MDT. In reality, this number would 
be much smaller depending on your average file size [no. of files = 
total size of OSTs / average file size].
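
A minimal sketch of the MDT arithmetic, assuming 4 KB means 4096 bytes and disk sizes are decimal (600 GB = 600 * 10^9 bytes). The 1 MB average file size is purely a made-up example value; plug in your own:

```python
BYTES_PER_FILE = 4 * 1024     # 4 KB of metadata per file (assumed 4096 bytes)
MDT_BYTES = 600 * 10**9       # 4 x 300 GB disks in RAID10
OST_BYTES = 32 * 10**12       # 4 OSTs of 8 TB each

def max_files_on_mdt(mdt_bytes=MDT_BYTES):
    """Upper bound on files the MDT can describe."""
    return mdt_bytes // BYTES_PER_FILE

def expected_files(ost_bytes, avg_file_bytes):
    """Files that fit on the OSTs for a given average file size."""
    return ost_bytes // avg_file_bytes

mdt_limit = max_files_on_mdt()                 # close to the ~150 million above
files = expected_files(OST_BYTES, 10**6)       # hypothetical 1 MB average
mdt_bytes_used = files * BYTES_PER_FILE

print(mdt_limit, files, mdt_bytes_used)
```

With that example file size the OSTs fill up long before the MDT does, which is why the real file count ends up well below the MDT limit.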

Hope this helps.

Cheers,
_Atul

Deval kulshrestha wrote:
> Hi
> I am considering a new storage of 30 TB usable space with a 2 GB/s
> sustained read write performance in clustered mode. But not able to
> figure out sizing part of it like what OSS, what OST and what MDS.
> Urgent help would be highly appreciable
>
> Thanks and Regards
> Deval

