[lustre-discuss] Mixed size OST's

E.S. Rosenberg esr+lustre at mail.hebrew.edu
Tue Mar 20 12:46:01 PDT 2018


Doesn't PFL also 'solve'/mitigate this issue in the sense that a file
doesn't have to remain restricted to the OST(s) it started on?
(And as such, balancing would even continue as files grow.)
Regards,
Eli

On Fri, Mar 16, 2018 at 9:57 PM, Dilger, Andreas <andreas.dilger at intel.com>
wrote:

> On Mar 15, 2018, at 09:48, Steve Thompson <smt at vgersoft.com> wrote:
> >
> > Lustre newbie here (1 month). Lustre 2.10.3, CentOS 7.4, ZFS 0.7.5. All
> networking is 10 GbE.
> >
> > I am building a test Lustre filesystem. So far, I have two OSS's, each
> with 30 disks of 2 TB each, all in a single zpool per OSS. Everything works
> well, and was surprisingly easy to build. Thus, two OST's of 60 TB each.
> File types consist mainly of home directories. Clients number about 225 HPC
> systems (about 2400 cores).
> >
> > In about a month, I will have a third OSS available, and about a month
> after that, a fourth. Each of these two systems has 48 disks of 4 TB each.
> I am looking for advice on how best to configure this. If I go with one OST
> per system (one zpool comprising 8 x 6 RAIDZ2 vdevs), I will have a Lustre
> f/s comprising two 60 TB OST's and two 192 TB OST's (minus RAIDZ2
> overhead). This is obviously a big mismatch between OST sizes. I have not
> encountered any discussion of the effect of mixing disparate OST sizes. I
> could instead format two 96 TB OST's on each system (two zpools of 4 x 6
> RAIDZ2 vdevs), or three 64 TB OST's, and so on. More OST's means more
> striping possibilities, but fewer vdevs per zpool impacts ZFS performance
> negatively. More OST's per OSS does not help with network bandwidth to the
> OSS. How would you go about this?
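A quick back-of-the-envelope check on the layouts proposed above (a sketch only: RAIDZ2 gives up two disks per vdev to parity, and real usable space will be lower once ZFS metadata and allocation overhead are counted; the 8-disk-vdev split for the three-OST case is my own assumption, not from the original message):

```python
# Rough usable capacity for the proposed RAIDZ2 layouts on the new
# OSS nodes (48 x 4 TB disks each). Ignores ZFS metadata/padding
# overhead, so treat these figures as upper bounds.

def usable_tb(vdevs, disks_per_vdev, disk_tb, parity=2):
    """Approximate usable TB of a zpool built from RAIDZ2 vdevs."""
    return vdevs * (disks_per_vdev - parity) * disk_tb

# One OST per OSS: a single zpool of 8 x (6-disk RAIDZ2) vdevs
print(usable_tb(8, 6, 4))   # 128 TB usable out of 192 TB raw

# Two OSTs per OSS: two zpools, each 4 x (6-disk RAIDZ2) vdevs
print(usable_tb(4, 6, 4))   # 64 TB usable per OST (96 TB raw)

# Three OSTs per OSS: 16 disks each; e.g. 2 x (8-disk RAIDZ2) vdevs
# (hypothetical split -- vdev geometry not specified in the post)
print(usable_tb(2, 8, 4))   # 48 TB usable per OST (64 TB raw)
```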
>
> This is a little bit tricky.  Lustre itself can handle different OST sizes,
> as it will run in "QOS allocator" mode (essentially "Quantity of Space";
> the full "Quality of Service" was never implemented).  This balances file
> allocation across OSTs based on percentage of free space, at the expense of
> lower performance, since only the two new OSTs would be used for 192/252
> ~= 75% of the files; it isn't possible to *also* use all the OSTs evenly
> at the same time (assuming that network speed is your bottleneck, and not
> disk speed).
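The ~75% figure can be checked with a toy model of space-weighted allocation (an illustration of the proportional-to-free-space idea only, not Lustre's actual allocator code; the OST names are made up):

```python
import random

# Toy model: each new file's object lands on an OST with probability
# proportional to that OST's free space. Two old 60 TB OSTs, two new
# 192 TB OSTs, all assumed empty.
free_tb = {"OST0": 60, "OST1": 60, "OST2": 192, "OST3": 192}

total = sum(free_tb.values())
share_new = (free_tb["OST2"] + free_tb["OST3"]) / total
print(f"{share_new:.0%}")   # 384/504 == 192/252, i.e. ~76% on the new OSTs

# Sanity-check the expected share by sampling 10,000 allocations
random.seed(0)
picks = random.choices(list(free_tb), weights=list(free_tb.values()),
                       k=10_000)
frac_new = sum(p in ("OST2", "OST3") for p in picks) / len(picks)
```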
>
> For home directory usage this may not be a significant issue. This
> performance imbalance would balance out as the larger OSTs became more
> full, and would not be seen when files are striped across all OSTs.
>
> I also thought about creating 3x OSTs per new OSS, so they would all be
> about the same size and allocated equally.  That means the new OSS nodes
> would see about 3x as much IO traffic as the old ones, especially for files
> striped over all OSTs.  The drawback here is that the performance imbalance
> would stay forever, so in the long run I don't think this is as good as
> just having a single larger OST.  This will also become less of a factor
> as more OSTs are added to the filesystem and/or you eventually upgrade the
> initial OSTs to have larger disks and/or more VDEVs.
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>