[lustre-discuss] Stripe size for osts
andreas.dilger at intel.com
Mon Mar 21 18:23:32 PDT 2016
On Mar 21, 2016, at 15:50, Pawel Dziekonski <dzieko at wcss.pl> wrote:
>> On Mon, 21 Mar 2016 at 09:24:02 +0000, Dilger, Andreas wrote:
>>> On 2016/03/18, 12:52, "Kurt Strosahl" <strosahl at jlab.org> wrote:
>>> Good Afternoon,
>>> I'm experimenting with ost configurations geared more towards small
>>> files and operations on those small files (like source code, and
>>> compiling), and I was wondering about changing the stripe size so that
>>> small files fit more efficiently on an ost. I believe that would be the
>>> --param lov.stripesize=XX option for mkfs.lustre, is that correct? And
>>> is there a lower limit that I should know about?
>> Just to clarify, the stripe size for Lustre is not a property of the OST,
>> but rather a property of each file. The OST itself allocates space
>> internally as it sees fit. For ldiskfs, space allocation is done in units
>> of 4KB blocks managed in extents, while ZFS has variable block sizes (512
>> bytes up to 1MB or more, but only one block size per file) managed in a
>> tree. In both cases, if a file is sparse then no blocks are allocated for
>> the holes in the file.
>> As for the minimum stripe size, this is 64KB, since it isn't possible to
>> have a stripe size below the PAGE_SIZE on the client, and some
>> architectures (e.g. IA64, PowerPC, Alpha) allowed 64KB PAGE_SIZE.
>> For small files, the stripe_size parameter is virtually meaningless, since
>> the data will never exceed a single stripe in size. What is much more
>> important is to use a stripe_count=1, so that the client doesn't have to
>> query multiple OSTs to determine the file size, timestamps, and other
> default stripe size is 1MB. Is there a reason for that?
Yes, because the underlying RAID hardware is usually configured as RAID-6 8+2 with a 1MB stripe width, so 1MB RPC writes avoid read-modify-write, and 1MB reads align with the allocation size that the filesystem used when the data was written.
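
To make the alignment argument concrete, here is a rough Python sketch of the arithmetic. It assumes the RAID-6 8+2 geometry mentioned above (8 data disks, 1MB of data per full RAID stripe, so 128KB per disk). The function names split_write and needs_read_modify_write are made up for illustration and are not part of any Lustre or RAID tool. The sketch only counts how many bytes of a write cover whole RAID stripes; it does not model caching or write coalescing in the controller.

```python
# Illustrative sketch only. The geometry is an assumption taken from the
# RAID-6 8+2 example (8 data disks, 1 MiB of data per full stripe), and the
# function names are hypothetical.
MIB = 1024 * 1024

FULL_STRIPE = 1 * MIB                 # data bytes per full RAID stripe
DATA_DISKS = 8                        # the "8" in 8+2
CHUNK = FULL_STRIPE // DATA_DISKS     # 128 KiB written to each data disk


def split_write(offset, length, full_stripe=FULL_STRIPE):
    """Return (full_stripe_bytes, partial_bytes) for a write.

    Bytes that cover an entire RAID stripe can be written together with
    freshly computed parity. The remaining bytes touch only part of a
    stripe, so the array must first read old data or parity
    (read-modify-write).
    """
    end = offset + length
    first_full = -(-offset // full_stripe) * full_stripe  # round offset up
    last_full = (end // full_stripe) * full_stripe        # round end down
    full = max(0, last_full - first_full)
    return full, length - full


def needs_read_modify_write(offset, length, full_stripe=FULL_STRIPE):
    """True if any part of the write covers only a partial RAID stripe."""
    return split_write(offset, length, full_stripe)[1] > 0


if __name__ == "__main__":
    # An aligned 1 MiB write versus the same write shifted by 512 KiB.
    for off, size in [(0, MIB), (512 * 1024, MIB), (0, 64 * 1024)]:
        print(off, size, split_write(off, size),
              needs_read_modify_write(off, size))
```

Under these assumptions, a 1MB write at a 1MB-aligned offset covers exactly one full stripe, while the same write shifted by 512KB straddles two stripes and covers neither of them completely.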