[Lustre-discuss] stripe offset and hot-spots

Andreas Dilger adilger at sun.com
Wed Dec 30 14:41:58 PST 2009

On 2009-12-29, at 13:51, Christopher J. Morrone wrote:
> Andreas Dilger wrote:
>> Well, that is already the default, unless it has been changed at
>> some time in the past by someone at your site.  We generally
>> recommend against ever changing the starting index of files, since
>> there are rarely good reasons to change this.  The man page says:
>>         A start-ost of -1 allows the MDS to choose the starting
>>         index and it is strongly recommended, as this allows
>>         space and load balancing to be done by the MDS as needed.
> The Lustre Manual should be updated to use that wording.  It still  
> says "random":
> http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_78664
> Also, it lists the default stripe-count as 1 when it should be 2.

AFAIK, the default stripe count is still 1.  Is it possible you've  
changed this default locally?

> Also, he might want to be aware that round-robin is only used
> if no two OSTs are imbalanced by more than 20%.  Otherwise, the
> weighted allocator kicks in:
> http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_pgfId-1293986
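
To make the switch-over described above concrete, here is a minimal
sketch in Python.  It is not Lustre code: the function name, the use of
free space as the measure, and the way "imbalance" is computed (spread
between the emptiest and fullest OST, relative to the emptiest) are all
assumptions for illustration.  Only the 20% figure comes from the
discussion.

```python
def pick_allocator(free_space, threshold=0.20):
    """Return which allocator a create would use (illustrative only).

    free_space: per-OST free space, in any consistent unit.
    threshold:  20% figure quoted in the thread; how Lustre actually
                measures the imbalance is an assumption here.
    """
    most, least = max(free_space), min(free_space)
    # Assumed definition: relative gap between emptiest and fullest OST.
    imbalance = (most - least) / most if most else 0.0
    return "weighted" if imbalance > threshold else "round-robin"
```

With this definition, OSTs with 100, 95 and 90 units free stay on
round-robin, while 100, 70 and 90 would tip over to the weighted
allocator.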


> We haven't had time to look into it very closely yet, but we have  
> been getting complaints from users that seem to be a result of the  
> weighted allocator.  It appears to not be uncommon for OSTs to get  
> more than 20% out of balance on our systems, so the weighted  
> allocator is in use fairly frequently.
> The users are complaining of reduced filesystem bandwidth, and we  
> suspect the weighted allocator.  It results in the users' files  
> being quite unevenly distributed among the OSTs.  Obviously, this is  
> done purposely, with files more likely to be created on OSTs that  
> have more free space.  But it also results in an unbalanced  
> distribution of files, and therefore poor bandwidth.
> We would probably prefer a simpler algorithm.  Possibly just stop  
> creating new files on any OST that is 20% more full, and round-robin  
> over the remaining OSTs.
> Like I said, we haven't had time to look into it too closely, so we  
> don't have a bug open yet.  But it is something to keep in mind.

There is bug 18547 that is open to track the development of an  
improved QOS-RR allocator.  The goal is to always use round-robin  
allocation, but selectively skip OSTs that are too full.  There would  
no longer be a "QOS mode" per se; it would always be active, hopefully  
avoiding imbalances gently as soon as they appear, rather than letting  
them get too far out of balance.
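
As a rough illustration of the idea (always round-robin, but skipping
OSTs that are too full), here is a minimal Python sketch.  It is not the
design from bug 18547: the `Ost` type, the function names, and the rule
"skip any OST more than 20 percentage points fuller than the emptiest
one" are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Ost:
    index: int
    used_fraction: float  # 0.0 = empty, 1.0 = full


def eligible(osts, margin=0.20):
    """OSTs no more than `margin` fuller than the emptiest OST (assumed rule)."""
    floor = min(o.used_fraction for o in osts)
    return [o for o in osts if o.used_fraction - floor <= margin]


def allocate(osts, stripe_count, start, margin=0.20):
    """Walk the OST list round-robin from `start`, skipping too-full OSTs."""
    ok = {o.index for o in eligible(osts, margin)}
    chosen = []
    n = len(osts)
    for step in range(n):
        ost = osts[(start + step) % n]
        if ost.index in ok:
            chosen.append(ost.index)
            if len(chosen) == stripe_count:
                break
    return chosen  # may be shorter than stripe_count if few OSTs qualify


osts = [Ost(0, 0.30), Ost(1, 0.35), Ost(2, 0.60), Ost(3, 0.32)]
stripes = allocate(osts, stripe_count=2, start=1)
```

In this example OST 2 is 30 points fuller than the emptiest OST, so a
two-stripe file starting at index 1 lands on OSTs 1 and 3 and OST 2 is
passed over; the remaining OSTs still see an even round-robin pattern.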

Cheers, Andreas
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
