[Lustre-discuss] stripe offset and hot-spots

Christopher J. Morrone morrone2 at llnl.gov
Tue Dec 29 12:51:05 PST 2009


Andreas Dilger wrote:

> Well, that is already the default, unless it has been changed at some  
> time in the past by someone at your site.  We generally recommend  
> against ever changing the starting index of files, since there are  
> rarely good reasons to change this.  The man page writes:
> 
>          A start-ost of -1 allows the MDS to choose the starting
>          index and it is strongly recommended, as this allows
>          space and load balancing to be done by the MDS as needed.

The Lustre Manual should be updated to use that wording.  It still says 
"random":

http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_78664

Also, it lists the default stripe-count as 1 when it should be 2.

Also, he might want to be aware that round-robin is only used if no 
two OSTs are imbalanced by more than 20%.  Otherwise, the weighted 
allocator kicks in:

http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_pgfId-1293986
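
To make the 20% rule concrete, here is a rough Python sketch of the switch-over as I understand it from the manual.  This is not the MDS code.  The function names are made up for illustration, and I am assuming that "imbalanced" means the difference in free space between the emptiest and fullest OST, taken relative to the emptiest one:

```python
def is_imbalanced(free_kb, threshold=0.20):
    """True if the emptiest and fullest OSTs differ in free space by more
    than `threshold`, measured relative to the emptiest OST.
    ASSUMPTION: this is only one plausible reading of "imbalanced"."""
    most, least = max(free_kb), min(free_kb)
    if most == 0:
        return False  # every OST is full, so there is nothing to compare
    return (most - least) / most > threshold


def pick_allocator(free_kb, threshold=0.20):
    """Name the allocator that would be used for the next file (hypothetical)."""
    return "weighted" if is_imbalanced(free_kb, threshold) else "round-robin"


print(pick_allocator([100, 95, 90]))  # spread is 10% of the emptiest OST
print(pick_allocator([100, 70, 90]))  # spread is 30% of the emptiest OST
```

With these inputs the first call stays on round-robin and the second switches to the weighted allocator.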

We haven't had time to look into it very closely yet, but we have been 
getting complaints from users that seem to be a result of the weighted 
allocator.  It does not appear to be uncommon for OSTs to get more than 
20% out of balance on our systems, so the weighted allocator is in use 
fairly frequently.

The users are complaining of reduced filesystem bandwidth, and we 
suspect the weighted allocator.  It results in the users' files being 
quite unevenly distributed among the OSTs.  This is deliberate, since 
files are more likely to be created on OSTs that have more free space.  
But when a job's files are concentrated on a few OSTs, those OSTs 
become a bottleneck, and bandwidth suffers.
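
A toy simulation shows the kind of skew I mean.  It assumes each new file lands on an OST with probability proportional to that OST's free space.  That is a simplification for illustration only, the real weighted allocator is more involved, and `simulate_weighted` is a made-up name:

```python
import random
from collections import Counter


def simulate_weighted(free_space, n_files, seed=0):
    """Place n_files single-stripe files, choosing each OST with probability
    proportional to its free space.  Returns the file count per OST.
    ASSUMPTION: static weights; free space is not updated as files land."""
    rng = random.Random(seed)
    osts = list(range(len(free_space)))
    picks = rng.choices(osts, weights=free_space, k=n_files)
    counts = Counter(picks)
    return [counts[i] for i in osts]


# Three fairly full OSTs and one with four times as much free space.
print(simulate_weighted([100, 100, 100, 400], 7000))
```

In this setup the last OST should receive roughly four times as many files as each of the others, so a job writing all of those files at once would be limited largely by that one OST.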

We would probably prefer a simpler algorithm.  Possibly just stop 
creating new files on any OST that is 20% more full, and round-robin 
over the remaining OSTs.
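
Something along these lines, as a sketch only.  I am assuming here that "20% more full" means an OST's used fraction is more than 20 percentage points above that of the least-full OST.  Other readings are possible, and the function names are hypothetical:

```python
from itertools import cycle


def eligible_osts(used_fraction, threshold=0.20):
    """Indices of OSTs whose used fraction is within `threshold` of the
    least-full OST.  ASSUMPTION: "20% more full" is read as an absolute
    difference in used fraction."""
    least_full = min(used_fraction)
    return [i for i, u in enumerate(used_fraction) if u - least_full <= threshold]


def round_robin(used_fraction, n_files, threshold=0.20):
    """Assign n_files to the eligible OSTs in plain round-robin order."""
    targets = cycle(eligible_osts(used_fraction, threshold))
    return [next(targets) for _ in range(n_files)]


usage = [0.30, 0.35, 0.60, 0.40]  # OST 2 is well ahead of the others
print(eligible_osts(usage))       # OST 2 is skipped
print(round_robin(usage, 5))      # files cycle evenly over the remaining OSTs
```

The point is that the files which do get created stay evenly spread over the OSTs still in rotation, instead of being biased toward whichever OST happens to be emptiest.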

Like I said, we haven't had time to look into it too closely, so we 
don't have a bug open yet.  But it is something to keep in mind.

Chris


