<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>All,</p>
<p>I am looking for a more complete understanding of how the two
settings qos_prio_free and qos_threshold_rr function together.</p>
<p>My current understanding, which may be inaccurate, is the
following:</p>
<p><b>qos_prio_free</b><b><br>
</b><br>
This setting controls how much Lustre prioritizes free space
(versus location for the sake of performance) in allocation.<br>
The higher this number, the more Lustre takes empty space on an
OST into consideration for its allocation.<br>
When set to 100%, Lustre uses ONLY empty space as the deciding
factor for writes.<br>
</p>
<p><b>qos_threshold_rr</b><b><br>
</b><br>
This setting controls how much consideration should be given to
QoS in allocation.<br>
The higher this number, the more QoS is taken into consideration.<br>
When set to 100%, Lustre ignores the QoS variable and hits all
OSTs equally.</p>
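<p>For reference, a minimal sketch of how I would print both values on
the MDS. The lov.* prefix is an assumption on my part (some releases may
expose these tunables under lod.* instead), and the LCTL and QOS_PREFIX
variables and the function name are hypothetical helpers, not anything
Lustre ships.</p>
<pre>
```shell
# Sketch only: print the two allocator tunables discussed above.
# Assumption: they are exposed under lov.* on the MDS; some releases may
# use lod.* instead, so the prefix can be overridden via QOS_PREFIX.
LCTL="${LCTL:-lctl}"
QOS_PREFIX="${QOS_PREFIX:-lov.*}"

show_qos_settings() {
    for name in qos_prio_free qos_threshold_rr; do
        # The glob is quoted so that lctl, not the shell, expands it.
        "$LCTL" get_param "${QOS_PREFIX}.${name}"
    done
}
```
</pre>
<p>Calling show_qos_settings on the MDS should print one line per
matching parameter.</p>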
<p>I'm looking for several answers:</p>
<p>1) Is my basic understanding of the above settings correct?</p>
<p>2) How does Lustre deal with OSTs that are 100% full? I'm curious
about this under two conditions.</p>
<p>2a) When you set qos_threshold_rr=100 -- meaning, go and hit all
the OSTs the same amount.</p>
<p>On one of our Lustre 2.5.3 filesystems, the allocator is not
working (a known bug, though I couldn't say why it seems to behave
fine on the other one), so we have configured
qos_threshold_rr=100. Since our OSTs are pretty dramatically
unbalanced, attempts to write to full OSTs have caused write
failures. Data deletes have now gotten us below 90% on all OSTs, and
while I can certainly take the fullest OSTs out of write mode if
needed, it seems to me that Lustre should, no matter what your
qos_threshold_rr setting, treat OSTs that are 100% full differently:
it should no longer attempt to write to them. In short, this seems
like a bug to me, although, granted, I suppose if you are overriding
the allocator, it's caveat user at that point. <br>
</p>
<p>2b) When you set qos_threshold_rr != 100 -- meaning, the
allocator is working<br>
</p>
On the other Lustre 2.5.3 system, the system defaults
(qos_prio_free=91%; qos_threshold_rr=17%) are hitting all the OSTs
when I run my test*, so I have not changed them. Several of the OSTs
in this file system are at 100%. I gather that we are not seeing
write failures because the allocator allocates to these OSTs less
frequently, based on how full they are. But I know from my test that
these OSTs are still in the mix, which implies to me that it
would be possible, although less likely, to see a write failure if a
write stream is opened on one of the 100% OSTs. I'd love to be able
to quantify that "less likely".<br>
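<p>One rough, empirical way to put a number on it is to turn the per-OST
counts that my test prints into percentages. This is only a sketch: it
assumes input lines of the form "count obdidx" as produced by uniq -c,
the function name is my own, and the result estimates where new files
land, not the probability of a write failure.</p>
<pre>
```shell
# Convert "count obdidx" lines (as printed by `uniq -c`) into the
# percentage of new files whose first object landed on each OST.
# Output: one "obdidx percent%" line per OST, in input order.
ost_share() {
    awk '{ count[$2] = $1; total += $1; order[NR] = $2 }
         END {
             if (total == 0) exit
             for (i = 1; i in order; i++)
                 printf "%s %.1f%%\n", order[i], 100 * count[order[i]] / total
         }'
}
```
</pre>
<p>An OST that never shows up in the output received none of the test
files in that run.</p>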
<br>
Basically, I guess my question is: is taking an OST out of write
mode the only (or best) way of preventing the fs from attempting to
write to it when it is nearly full?<br>
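<p>To pick out the OSTs that would be candidates for this, the lfs df
output can be filtered by fill level. A minimal sketch, assuming Use% is
the fifth column and that OST lines carry "OST" in the first column; the
function name and the default threshold of 90 are my own choices.</p>
<pre>
```shell
# List OSTs whose Use% is at or above a threshold (default 90).
# Reads `lfs df` output on stdin and prints "UUID Use%" per match.
# Assumption: Use% is column 5 and OST rows have "OST" in column 1.
full_osts() {
    awk -v limit="${1:-90}" '$1 ~ /OST/ {
        pct = $5
        sub(/%/, "", pct)          # strip the percent sign
        if (pct + 0 >= limit + 0) print $1, $5
    }'
}
```
</pre>
<p>For example, with a hypothetical mount point: lfs df /mnt/lustre |
full_osts 95</p>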
<br>
Thanks,<br>
Jessica<br>
<br>
<p>------------------------------<br>
</p>
<p>*To test file allocation on your Lustre system, you can use this
one-liner from a Lustre client. USE IT IN ITS OWN, NEW DIRECTORY: it
creates 2000 files named t.* and then deletes everything matching t.*!</p>
<pre>touch t.{1..2000}; lfs getstripe t.*|fgrep -A1 obdidx|fgrep -v obdidx|fgrep -v -- --|awk '{ print $1 }'|sort|uniq -c; rm -f t.*</pre>
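<p>The counting half of that one-liner can also be written as a small,
commented function. This is a sketch under the assumption that lfs
getstripe prints an "obdidx" header line followed directly by the first
stripe's line, with the OST index as its first field; the function name
is hypothetical.</p>
<pre>
```shell
# Count how many files have their first object on each OST index.
# Reads `lfs getstripe` output on stdin; prints "count obdidx" per OST.
count_first_stripe_osts() {
    # The line right after each "obdidx" header describes the first
    # stripe; its first field is the OST index.
    awk '/obdidx/ { if ((getline) > 0) print $1 }' |
        sort -n | uniq -c |
        awk '{ print $1, $2 }'   # normalise the padding added by uniq -c
}
```
</pre>
<p>In the test above it would replace the fgrep chain: lfs getstripe t.*
| count_first_stripe_osts</p>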
<br>
<pre class="moz-signature" cols="72">--
Jessica Otey
System Administrator II
North American ALMA Science Center (NAASC)
National Radio Astronomy Observatory (NRAO)
Charlottesville, Virginia (USA)</pre>
</body>
</html>