[Lustre-discuss] mkfs options/tuning for RAID based OSTs

Paul Nowoczynski pauln at psc.edu
Tue Oct 19 14:43:15 PDT 2010


Ed,
Does 'segment size' refer to the amount of data written to each disk 
before proceeding to the next disk (i.e. the stride)?  That is my guess, 
since these values are usually powers of two and 51.2KB 
[512KB/(10 data disks)] is therefore probably not the stride size.  In any 
event, I think you'll get the most bang for your buck by creating a raid 
stripe where n_data_disks * stride = 1MB.  My recent experience with 
our software raid6 systems here is that eliminating 
read-modify-write is key to achieving good performance.  I would 
recommend exploring configurations where the number of data disks is 
a power of 2 so that you can configure the stripe size to be 1MB.  I 
wouldn't be surprised if you see better performance by dividing the 12 
disks into 2x(4+2) raid6 luns. 
paul
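
The arithmetic behind the 1MB target can be checked with a short sketch. 
The helper names below and the 8+2 layout are illustrative assumptions, 
not something taken from the thread; the 10+2 and 4+2 layouts and the 1MB 
figure are the ones discussed above.

```python
# Sketch: which per-disk segment size makes one full stripe equal 1MB?
TARGET_KB = 1024  # the 1MB full-stripe target discussed above


def full_stripe_kb(n_data_disks, segment_kb):
    """Data written across all data disks in one full stripe, in KB."""
    return n_data_disks * segment_kb


def segment_for_target(n_data_disks, target_kb=TARGET_KB):
    """Segment size in KB that gives exactly target_kb per full stripe.

    Returns None when the target does not divide evenly across the data
    disks, which is the case for 10 data disks.
    """
    if target_kb % n_data_disks != 0:
        return None
    return target_kb // n_data_disks


# 10+2 is the current layout; 4+2 is the split suggested above;
# 8+2 is a hypothetical extra case for comparison.
for n_data in (10, 8, 4):
    print(n_data, "data disks ->", segment_for_target(n_data), "KB segment")
```

With 10 data disks no whole-KB segment size yields exactly 1MB, while 4 
data disks work out to a 256KB segment.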

Edward Walter wrote:
> Hello All,
>
> We're doing a fresh Lustre 1.8.4 install using Sun StorageTek 2540 
> arrays for our OST targets.  We've configured these as RAID6 with no 
> spares which means we have the equivalent of 10 data disks and 2 parity 
> disks in play on each OST.
>
> We configured the "Segment Size" on these arrays at 512 KB.  I believe 
> this is equivalent to the "chunk size" in the Lustre operations manual 
> (section 10.1.1).  Based on the formulae in the manual, for my stripe 
> width to fall below 1MB I need to reconfigure my "Segment Size" like 
> this:
>
> Segment Size <= 1024KB/(12-2) = 102.4 KB
> so 16KB, 32KB or 64KB are optimal values
> Does this seem right?
>
> Do I really need to do this (reinitialize the arrays/volumes) to get my 
> stripe width below 1MB?  What impact will/won't this have on performance?
>
> When I format the OST filesystem, I need to provide options for both 
> stripe and stride.  The manual indicates that the units for these values 
> are 4096-byte (4KB) blocks.  Given that, I should use something like:
>
> -E stride= (one of)
>     16KB/4KB = 4
>     32KB/4KB = 8
>     64KB/4KB = 16
>
> stripe= (one of)
>     16KB*10/4KB = 40
>     32KB*10/4KB = 80
>     64KB*10/4KB = 160
>
> so for example I would issue the following:
> mkfs.lustre --mountfsoptions="stripe=160" --mkfsoptions="-E stride=16 -m 1" ...
>
> Is it better to opt for the higher values or lower values here?
>
> Also, does anyone have recommendations for "aligning" the filesystem so 
> that the fs blocks align with the RAID chunks?  We've done things like 
> this for SSD drives.  We'd normally give Lustre the entire RAID device 
> (without partitions) so this hasn't been an issue in the past.  For this 
> installation, though, we're creating multiple volumes (for size/space 
> reasons), so partitioning is a necessary evil now.
>
> Thanks for any feedback!
>
> -Ed Walter
> Carnegie Mellon University
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
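
The stride and stripe values in the quoted message can be reproduced with 
a small sketch, assuming 4KB filesystem blocks and 10 data disks as stated 
there. The function names are hypothetical, and the alignment check 
reflects one common convention (starting a partition on a multiple of the 
full stripe width), which is an assumption rather than advice from the 
thread.

```python
# Sketch: stride/stripe in 4KB blocks, plus a simple alignment check.
BLOCK_KB = 4  # 4096-byte filesystem blocks, as in the quoted message


def stride_blocks(segment_kb, block_kb=BLOCK_KB):
    """Filesystem blocks written to one disk before moving to the next."""
    return segment_kb // block_kb


def stripe_blocks(segment_kb, n_data_disks, block_kb=BLOCK_KB):
    """Filesystem blocks in one full stripe across all data disks."""
    return stride_blocks(segment_kb, block_kb) * n_data_disks


def is_stripe_aligned(offset_kb, segment_kb, n_data_disks):
    """True if a partition start offset is a multiple of the full stripe."""
    return offset_kb % (segment_kb * n_data_disks) == 0


# The three candidate segment sizes from the quoted message.
for seg_kb in (16, 32, 64):
    print("segment", seg_kb, "KB:",
          "stride =", stride_blocks(seg_kb),
          "stripe =", stripe_blocks(seg_kb, 10))
```

For a 64KB segment and 10 data disks the full stripe is 640KB, so a 
partition starting at 1280KB would pass the check while one starting at 
1024KB would not.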



