[lustre-discuss] a question of balance

Sat Dec 12 09:40:41 PST 2015

On 2015/12/11, 10:29, "lustre-discuss on behalf of John White"
<lustre-discuss-bounces at lists.lustre.org on behalf of jwhite at lbl.gov>
wrote:

>Alright, so the rule is always balance luns.  Capacity and performance
>should be uniform across all luns or you'll run into unpredictable or
>inefficient IO patterns.
>
>What is the current state of that rule in relation to Lustre?  I have an
>existing file system, lives on a DDN 12k.  3TB drives, 8+2p, very common
>config.  We're looking to grow that FS and bossman keeps asking if we
>really need to stick with those 3TB spindles or if we could go with the
>nice pricing we're seeing for 4TB and beyond.  This obviously makes me
>cringe, but I see two options (both involve wiping the existing FS,
>regardless):
>
>-Just do it - Set up the 2 arrays 8+2p, have 24TB luns and 32TB luns and
>let lustre weighted allocation kick in when it feels it should.
>
>-2 storage pools in the same namespace.  Set up the namespace with 2
>primary directories, a pool for each.  Deal with the insanely annoying
>job of allocating user data to each, deal with the horror of imbalanced
>workloads manually placed on each.
>
>I'd love someone to say the first option is just peachy these days but I
>suspect it's still the muddy, murky freakshow people always warn against
>(especially when you start hitting that 75% fs capacity mark).
>
>Comments?  Screeds?  Insults?  I'd love to hear some insight here.

The Lustre OST load balancing has not yet been reimplemented to what I
would like it to be, but is at least functional.  If you are adding new
OSTs, performance usually won't be worse than what you had before, but
since the balancing is based on weighted random probabilities it can still
misbehave at times.  I've definitely seen systems out there that have
different OST sizes, and it usually helps if you are adding a number of
new OSTs at the same time, instead of just one or two.
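
As a rough illustration of what "weighted random probabilities" means here,
the following sketch picks an OST for each new file with probability
proportional to its remaining free space.  This is not the actual Lustre
allocator; the function names, the OST counts (four 24TB and four 32TB) and
the fixed file size are all assumptions made for the example.

```python
import random


def pick_ost(free_tb, rng):
    """Pick one OST index with probability proportional to its free space."""
    indices = list(free_tb)
    # Clamp at zero so a full OST can never be selected.
    weights = [max(free_tb[i], 0.0) for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


def simulate(sizes_tb, n_files, file_tb, seed=0):
    """Place n_files equal-sized files; return (files per OST, free TB per OST)."""
    rng = random.Random(seed)
    free = dict(sizes_tb)
    counts = {i: 0 for i in free}
    for _ in range(n_files):
        ost = pick_ost(free, rng)
        free[ost] -= file_tb
        counts[ost] += 1
    return counts, free


# Hypothetical layout: OSTs 0-3 are 24TB LUNs, OSTs 4-7 are 32TB LUNs.
sizes = {i: 24.0 for i in range(4)}
sizes.update({i: 32.0 for i in range(4, 8)})

counts, free = simulate(sizes, n_files=10000, file_tb=0.01, seed=1)
for ost in sorted(counts):
    print(f"OST{ost}: {counts[ost]} files, {free[ost]:.2f} TB free")
```

On average the larger OSTs receive more files, but because each choice is
random, individual OSTs can run ahead of or behind their share, and with
only one or two new OSTs those few targets absorb most of the new writes.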

If you disable the space balancing entirely, then with your 32TB LUNs the
rebalancing will take a long time unless you start actively finding unused
large files to migrate, and you will still have a bunch of unused space
until the rest of your OSTs also grow that large.  This wouldn't be the
end of the world either.
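
To put a number on that unused space, here is a back-of-the-envelope
sketch.  It assumes allocation is perfectly uniform across OSTs (no space
weighting) and that writes effectively stop once the smallest OST is full.
The OST counts are hypothetical; only the 24TB and 32TB sizes come from the
question.

```python
def stranded_space_tb(ost_sizes_tb):
    """Capacity left unused if every OST receives the same amount of data
    and the file system is treated as full once the smallest OST fills."""
    smallest = min(ost_sizes_tb)
    return sum(size - smallest for size in ost_sizes_tb)


# Hypothetical layout: four 24TB OSTs plus four 32TB OSTs.
sizes = [24] * 4 + [32] * 4
print(stranded_space_tb(sizes))  # 32 (TB): 8TB stranded on each 32TB OST
```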

If you really are keen on trying to fix this code yourself, you could
look at the discussion at
https://bugzilla.lustre.org/show_bug.cgi?id=18547 and
https://jira.hpdd.intel.com/browse/LU-9 to take a crack at it.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Principal Architect
Intel High Performance Data Division