<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
I haven't tried it, but the man page for setstripe --pool explains
it.<br>
<br>
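A rough sketch of what the pool-based variant could look like, assuming two OST pools already exist (hypothetically named "flash" and "hdd" here, e.g. created with "lctl pool_new" and "lctl pool_add"). The Python helper below only assembles an "lfs setstripe" command line with one -p (pool) option per -E component; the helper name, pool names and directory are made up for illustration, and nothing here has been run against a real filesystem.<br>
<pre>
```python
import shlex


def pfl_pool_command(directory, components):
    """Build an 'lfs setstripe' command line for a pool-based PFL layout.

    components: list of (extent_end, pool_name) tuples in file-offset order,
    where extent_end is a string such as "64K" or "eof".
    """
    args = ["lfs", "setstripe"]
    for extent_end, pool in components:
        # -E ends the component at this offset; -p names the OST pool for it.
        args += ["-E", extent_end, "-p", pool]
    args.append(directory)
    return shlex.join(args)


# Hypothetical pool names and directory, for illustration only.
print(pfl_pool_command("/mnt/lustre/project", [("64K", "flash"), ("eof", "hdd")]))
# prints: lfs setstripe -E 64K -p flash -E eof -p hdd /mnt/lustre/project
```
</pre>
The first 64K of each new file in that directory would then go to the "flash" pool and everything beyond it to the "hdd" pool, which is the arrangement Marion describes.<br>
<br>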
Cheers,<br>
Hans Henrik<br>
<br>
<div class="moz-cite-prefix">On 06.07.2022 22.50, Thomas Roth via
lustre-discuss wrote:<br>
</div>
<blockquote type="cite"
cite="mid:54bbc9ed-8724-bb78-dddd-cb39c109023e@gsi.de">Yes, I got
it.
<br>
But Marion states that they switched
<br>
> to a PFL arrangement, where the first 64k lives on flash
OST's (mounted on our metadata servers), and the remainder of
larger files lives on HDD OST's.
<br>
<br>
So, how do you specify particular OSTs (or a group of OSTs) in a
PFL?
<br>
The OST equivalent of the "-L mdt" part?
<br>
<br>
With SSDs and HDDs making up the OSTs, I would have guessed OST
pools, but I'm only aware of an "lfs setstripe" that puts all of my
file into one pool. How do I put the first few kB of a file in pool A
and the rest in pool B?
<br>
<br>
<br>
Cheers
<br>
Thomas
<br>
<br>
<br>
On 7/6/22 21:42, Andreas Dilger wrote:
<br>
<blockquote type="cite">Thomas,
<br>
where the file data is stored depends entirely on the PFL layout
used for the filesystem or parent directory.
<br>
<br>
For DoM files, you need to specify a DoM component, like:
<br>
<br>
lfs setstripe -E 64K -L mdt -E 1G -c 1 -E 16G -c 4 -E eof
-c 32 <dir>
<br>
<br>
so the first 64KB will be put onto the MDT where the file is
created, the rest of the first 1GB onto a single OST, the next 15GB
striped across 4 OSTs, and the rest of the file striped across
(up to) 32 OSTs.
<br>
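To make the extent boundaries concrete, here is a small sketch in Python that maps a byte offset to the component of the example layout above. It is only a model of the layout as described (component ends at 64K, 1G, 16G and eof), not something Lustre provides; the names LAYOUT and component_for_offset are made up for illustration.<br>
<pre>
```python
from bisect import bisect_right

KIB = 1024
GIB = 1024 ** 3

# (end offset in bytes, description) for each component of the example layout;
# float("inf") stands in for "eof".
LAYOUT = [
    (64 * KIB, "MDT (DoM)"),
    (1 * GIB, "1 OST"),
    (16 * GIB, "4 OSTs"),
    (float("inf"), "up to 32 OSTs"),
]


def component_for_offset(offset, layout=LAYOUT):
    """Return the description of the component that holds byte `offset`."""
    ends = [end for end, _ in layout]
    # A component covers [previous end, end), so an offset equal to an
    # end boundary belongs to the next component.
    return layout[bisect_right(ends, offset)][1]


print(component_for_offset(3 * KIB))    # MDT (DoM)
print(component_for_offset(64 * KIB))   # 1 OST
print(component_for_offset(2 * GIB))    # 4 OSTs
print(component_for_offset(20 * GIB))   # up to 32 OSTs
```
</pre>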
<br>
64KB is the minimum DoM component size, but if the files are
smaller (e.g. 3KB) they will only allocate space on the MDT in
multiples of 4KB blocks. However, the default ldiskfs MDT
formatting only leaves about 1 KB of space per inode, which
would quickly run out unless DoM is restricted to specific
directories with small files, or if the MDT is formatted with
enough free space to accommodate this usage. This is less of an
issue with ZFS MDTs, but DoM files will still consume space much
more quickly and reduce the available inode count 16-64 times
faster than without DoM.
<br>
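The 4KB-block rounding mentioned above can be sketched as follows. This is a simplified estimate under stated assumptions (a 64KB DoM component, 4KB blocks, no inode or other metadata overhead counted), and the function name dom_mdt_bytes is hypothetical.<br>
<pre>
```python
KIB = 1024


def dom_mdt_bytes(file_size, dom_size=64 * KIB, block=4 * KIB):
    """Estimate the MDT data space used by one DoM file, in bytes."""
    on_mdt = min(file_size, dom_size)  # only the DoM component lives on the MDT
    blocks = -(-on_mdt // block)       # ceiling division to whole blocks
    return blocks * block


print(dom_mdt_bytes(3 * KIB))    # 4096: a 3KB file occupies one 4KB block
print(dom_mdt_bytes(200 * KIB))  # 65536: capped at the 64KB DoM component
```
</pre>
Multiplying such per-file figures by the expected number of small files gives a first guess at how much MDT space a DoM layout would need.<br>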
<br>
It is strongly recommended to use Lustre 2.15 with DoM to
benefit from the automatic MDT space balancing, otherwise the
MDT usage may become imbalanced if the admin (or users) do not
actively manage the MDT selection for new user/project/job
directories with "lfs mkdir -i".
<br>
<br>
Cheers, Andreas
<br>
<br>
On Jul 6, 2022, at 10:48, Thomas Roth via lustre-discuss
<<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>>
wrote:
<br>
<br>
Hi Marion,
<br>
<br>
I do not fully understand how to "mount flash OSTs on a metadata
server"
<br>
- You have a couple of SSDs, you assemble these into one block
device and format it with "mkfs.lustre --ost ..."? And then
mount it just like any other OST?
<br>
- PFL then puts the first 64k on these OSTs and the rest of all
files on the HDD-based OSTs?
<br>
So, no magic on the MDS?
<br>
<br>
I'm asking because we are considering something similar, but we
would not have these flash-OSTs in the MDS-hardware but on
separate OSS servers.
<br>
<br>
<br>
Regards,
<br>
Thomas
<br>
<br>
On 23/02/2022 04.35, Marion Hakanson via lustre-discuss wrote:
<br>
Hi again,
<br>
<a class="moz-txt-link-abbreviated" href="mailto:karagol@aselsan.com.tr">karagol@aselsan.com.tr</a>
said:
<br>
I was thinking that DoM is a built-in feature and that it can be
enabled/disabled online for certain directories. What do you mean by
a reformat to convert to DoM (or away from it)? I think just the
metadata target size is important.
<br>
<br>
When we first turned on DoM, it's likely that our Lustre system was
old enough to need to be reformatted in order to support it. Our flash
storage RAID configuration also needed to be expanded, but the system
was not yet in production, so a reformat was no big deal at the time.
So perhaps your system will not be subject to this requirement (other
than expanding your MDT flash somehow).
<br>
<br>
<a class="moz-txt-link-abbreviated" href="mailto:karagol@aselsan.com.tr">karagol@aselsan.com.tr</a>
said:
<br>
I also thought about creating a flash OST on the metadata server, but I
was not sure what to install on the metadata server for this purpose.
Can a metadata server be an OSS at the same time? If that is possible, I
would prefer a flash OST on the metadata server instead of DoM. Because
our metadata target is small, it seems I would have to do risky
operations to expand it.
<br>
<br>
Yes, our metadata servers are also OSS's at the same time. The flash
OST's are separate volumes (and drives) from the MDT's, so less
scary (:-).
<br>
<br>
<a class="moz-txt-link-abbreviated" href="mailto:karagol@aselsan.com.tr">karagol@aselsan.com.tr</a>
said:
<br>
IMHO, because of the reduced RPC traffic, DoM performs better than a
flash OST. Am I right?
<br>
<br>
The documentation does say that using DoM for small files will
produce less RPC traffic than using OST's for small files.
But as I said earlier, for us, the amount of flash needed to support
DoM was a lot higher than with the flash OST approach (we have a high
percentage, by number, of small files).
<br>
<br>
I'll also note that we had a wish to mostly "set and forget" the layout
for our Lustre filesystem. We have not figured out a way to predict
or control where small files (or large ones) are going to end up, so
trying to craft optimal layouts in particular directories for
particular file sizes has turned out to not be feasible for us. PFL has
been a win for us here, for that reason.
<br>
<br>
Our conclusion was that in order to take advantage of the performance
improvements of DoM, you need enough money for lots of flash, or you
need enough staff time to manage the DoM layouts to fit into that
flash. We have neither of those conditions, and we find that using PFL
and flash OST's for small files is working very well for us.
<br>
Regards,
<br>
Marion
<br>
From: Taner KARAGÖL
<<a class="moz-txt-link-abbreviated" href="mailto:karagol@aselsan.com.tr">karagol@aselsan.com.tr</a>>
<br>
To: Marion Hakanson
<<a class="moz-txt-link-abbreviated" href="mailto:hakansom@ohsu.edu">hakansom@ohsu.edu</a>>
<br>
CC:
"<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>"
<<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>>
<br>
Date: Tue, 22 Feb 2022 04:53:03 +0000
<br>
<br>
UNCLASSIFIED
<br>
<br>
Thank you for sharing your experience.
<br>
<br>
I was thinking that DoM is a built-in feature and that it can be
enabled/disabled online for certain directories. What do you
mean by a reformat to convert to DoM (or away from it)? I think
just the metadata target size is important.
<br>
<br>
I also thought about creating a flash OST on the metadata server, but I
was not sure what to install on the metadata server for this purpose.
Can a metadata server be an OSS at the same time? If that is
possible, I would prefer a flash OST on the metadata server instead of
DoM. Because our metadata target is small, it seems I would have
to do risky operations to expand it.
<br>
<br>
IMHO, because of the reduced RPC traffic, DoM performs better
than a flash OST. Am I right?
<br>
<br>
Best Regards;
<br>
<br>
<br>
From: Marion Hakanson
<<a class="moz-txt-link-abbreviated" href="mailto:hakansom@ohsu.edu">hakansom@ohsu.edu</a>>
<br>
Sent: Thursday, February 17, 2022 8:20 PM
<br>
To: Taner KARAGÖL
<<a class="moz-txt-link-abbreviated" href="mailto:karagol@aselsan.com.tr">karagol@aselsan.com.tr</a>>
<br>
Cc:
<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a><br>
Subject: Re: [lustre-discuss] How to speed up Lustre
<br>
<br>
We started with DoM on our new Lustre system a couple years ago.
<br>
- Converting to DoM (or away from it) is a full-reformat
operation.
<br>
- DoM uses a fixed amount of metadata space (64k minimum for
us) for every file, even those smaller than 64k.
<br>
<br>
Basically, DoM uses a lot of flash metadata space, more than we
planned for, and more than we could afford.
<br>
<br>
We ended up switching to a PFL arrangement, where the first 64k
lives on flash OST's (mounted on our metadata servers), and the
remainder of larger files lives on HDD OST's. This is working
very well for our small-file workloads, and uses less flash
space than the DoM configuration did.
<br>
<br>
Since you don't already have DoM in effect, it may be possible
that you could add flash OST's, configure a PFL, and then use
"lfs migrate" to re-layout existing files into the new OST's.
Your mileage may vary, so be safe!
<br>
<br>
Regards,
<br>
<br>
Marion
<br>
<br>
<br>
<br>
On Feb 14, 2022, at 03:32, Taner KARAGÖL via lustre-discuss
<<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>>
wrote:
<br>

<br>
UNCLASSIFIED
<br>
<br>
Hi Everybody;
<br>
<br>
We have a performance problem with small files on our HPC system
(120 compute nodes). All of our OSS targets are classic spinning
HDDs. To speed things up, I want to configure Data on MDT (DoM). Our
metadata target has SSDs.
<br>
<br>
The underlying file systems are ZFS (for OSS and metadata)
<br>
Lustre version: 2.12.5
<br>
ZFS version: 0.7.13
<br>
<br>
Our Lustre file system size is 720TB (2 OSS servers, 1 enclosure
with 6 zpools); the metadata file system size is 2.1TB (1 enclosure
and 1 metadata target).
<br>
<br>
What are your suggestions for speeding up this setup? I want to
configure DoM, but I am concerned about the metadata size. My
questions:
<br>
<br>
1. How can I increase the metadata size? The metadata enclosure
has empty slots. Is there a way to increase the size
online/offline?
<br>
2. Is it possible to migrate big files from DoM to OSS
targets completely? Online migration, of course. (So I think I
can free metadata space for new small files.)
<br>
<br>
Best Regards;
<br>
Taner
<br>
________________________________
<br>
Attention:
<br>
<br>
This e-mail message is privileged and confidential. If you are
not the intended recipient please delete the message and notify
the sender. E-mails to and from the company are monitored for
operational reasons and in accordance with lawful business
practices. Any views or opinions presented are solely those of
the author and do not necessarily represent the views of the
company.
<br>
<br>
________________________________
<br>
<br>
<br>
_______________________________________________
<br>
lustre-discuss mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
<br>
<a class="moz-txt-link-freetext" href="https://urldefense.com/v3/__http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org__;!!Mi0JBg!bW2FnSTRNdX7DpkjIiMayeexmYJ3D5Xt7wtneny2zgGi1ZXPcy7QMRlM3mno-HWR$">https://urldefense.com/v3/__http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org__;!!Mi0JBg!bW2FnSTRNdX7DpkjIiMayeexmYJ3D5Xt7wtneny2zgGi1ZXPcy7QMRlM3mno-HWR$</a>
<br>
<br>
--
<br>
--------------------------------------------------------------------
<br>
Thomas Roth
<br>
Department: Informationstechnologie
<br>
Location: SB3 2.291
<br>
<br>
<br>
GSI Helmholtzzentrum für Schwerionenforschung GmbH
<br>
Planckstraße 1, 64291 Darmstadt, Germany,
<a class="moz-txt-link-abbreviated" href="http://www.gsi.de">www.gsi.de</a>
<br>
<br>
Commercial Register / Handelsregister: Amtsgericht Darmstadt,
HRB 1528
<br>
Managing Directors / Geschäftsführung:
<br>
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
<br>
Chairman of the Supervisory Board / Vorsitzender des
GSI-Aufsichtsrats:
<br>
State Secretary / Staatssekretär Dr. Volkmar Dietz
<br>
<br>
Cheers, Andreas
<br>
--
<br>
Andreas Dilger
<br>
Lustre Principal Architect
<br>
Whamcloud
<br>
</blockquote>
</blockquote>
<br>
</body>
</html>