[Lustre-discuss] stripe offset and hot-spots
Christopher J. Morrone
morrone2 at llnl.gov
Tue Dec 29 12:51:05 PST 2009
Andreas Dilger wrote:
> Well, that is already the default, unless it has been changed at some
> time in the past by someone at your site. We generally recommend
> against ever changing the starting index of files, since there are
> rarely good reasons to change this. The man page writes:
>
>     A start-ost of -1 allows the MDS to choose the starting
>     index and it is strongly recommended, as this allows
>     space and load balancing to be done by the MDS as needed.
The Lustre Manual should be updated to use that wording. It still says
"random":
http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_78664
Also, it lists the default stripe-count as 1 when it should be 2.
Also, he might want to be aware that round-robin is only used if no
two OSTs are imbalanced by more than 20%. Otherwise, the weighted
allocator kicks in:
http://manual.lustre.org/manual/LustreManual18_HTML/StripingAndIOOptions.html#50532485_pgfId-1293986
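To make the switch between the two allocators concrete, here is a rough sketch in Python. It is not the Lustre code: the function names are made up, "imbalanced by more than 20%" is assumed to mean the gap in free space between the emptiest and fullest OST relative to the emptiest, and the weighted branch is assumed to pick OSTs with probability proportional to free space. The real allocator's weighting is more involved.

```python
import random

IMBALANCE_THRESHOLD = 0.20  # the 20% figure quoted from the manual


def is_balanced(free_bytes, threshold=IMBALANCE_THRESHOLD):
    """Assumed test: the free-space gap between the emptiest and the
    fullest OST, relative to the emptiest, is within the threshold."""
    most, least = max(free_bytes), min(free_bytes)
    if most == 0:
        return True  # every OST is full, so nothing is out of balance
    return (most - least) / most <= threshold


def pick_ost(free_bytes, cursor, rng=random):
    """Return (chosen OST index, updated round-robin cursor).

    Hypothetical helper: round-robin while balanced, otherwise a
    random choice weighted by free space.
    """
    n = len(free_bytes)
    if is_balanced(free_bytes):
        return cursor % n, (cursor + 1) % n
    # Weighted branch (assumption): more free space, more likely chosen.
    chosen = rng.choices(range(n), weights=free_bytes, k=1)[0]
    return chosen, cursor


if __name__ == "__main__":
    # Balanced: OSTs are handed out in order.
    print(pick_ost([100, 90, 85], cursor=0))
    # Unbalanced: the emptiest OST is picked far more often.
    rng = random.Random(0)
    picks = [pick_ost([100, 50, 10], 0, rng)[0] for _ in range(1000)]
    print([picks.count(i) for i in range(3)])
```

The point of the second case is that, once the weighted branch is active, new files cluster on the OSTs with the most free space instead of being spread evenly.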
We haven't had time to look into it very closely yet, but we have been
getting complaints from users that seem to be a result of the weighted
allocator. OSTs on our systems quite often get more than 20% out of
balance, so the weighted allocator is in use fairly frequently.
The users are complaining of reduced filesystem bandwidth, and we
suspect the weighted allocator. It results in the users' files being
quite unevenly distributed among the OSTs. This is deliberate, of
course: files are more likely to be created on OSTs that have more
free space. But the uneven distribution of files also means poor
bandwidth.
We would probably prefer a simpler algorithm. One possibility is to
just stop creating new files on any OST that is more than 20% fuller
than the rest, and round-robin over the remaining OSTs.
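Roughly what I have in mind, as a Python sketch. The names are hypothetical, and "20% fuller" is assumed here to mean more than 20 percentage points above the usage of the least-full OST; the exact cutoff would need more thought.

```python
def eligible_osts(used_fraction, margin=0.20):
    """Indices of OSTs whose usage is within `margin` of the least-full
    OST (assumed reading of "20% more full")."""
    emptiest = min(used_fraction)
    return [i for i, u in enumerate(used_fraction) if u - emptiest <= margin]


def next_ost(used_fraction, cursor, margin=0.20):
    """Plain round-robin that skips over-full OSTs.

    Returns (chosen OST index, updated cursor). The least-full OST is
    always eligible, so the loop always finds a candidate.
    """
    n = len(used_fraction)
    allowed = set(eligible_osts(used_fraction, margin))
    for step in range(n):
        idx = (cursor + step) % n
        if idx in allowed:
            return idx, (idx + 1) % n
    raise ValueError("no eligible OST")  # unreachable for non-empty input


if __name__ == "__main__":
    usage = [0.50, 0.55, 0.80, 0.60]  # OST 2 is well above the rest
    cursor, order = 0, []
    for _ in range(4):
        ost, cursor = next_ost(usage, cursor)
        order.append(ost)
    print(order)
```

Files stay evenly spread over every OST that is still accepting new objects, and the over-full OST simply sits out until the others catch up.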
Like I said, we haven't had time to look into it too closely, so we
don't have a bug open yet. But it is something to keep in mind.
Chris