[Lustre-discuss] OSS Service Thread Count
Oleg Drokin
Oleg.Drokin at Sun.COM
Sun Jan 25 21:01:21 PST 2009
Hello!
On Jan 25, 2009, at 6:56 PM, Wojciech Turek wrote:
> For my particular case it gives 512 ost_num_threads, which is the Lustre
> maximum for this parameter. The manual says that each thread actually
> uses 1.5MB of RAM, so 768MB of RAM will be consumed on each of my OSSs
> for I/O threads.
> So I guess with 16GB of RAM the initial (default) value of
> ost_num_threads is already being set to 512, is that correct?
> I know that adding more OSSs and OSTs might help, but at the moment this
> isn't an option for me.
> Is there any other way I could lower the high load on the OSSs? Can
> tuning the client side help?
To decrease the load you actually want to decrease the number of OST
threads (the ost_num_threads module parameter of the ost.ko module).
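For reference, module parameters like this are normally set via the
module options file, something like the following (the value 128 and the
file path are just illustrative; the location varies by distribution, and
the module has to be reloaded for a new value to take effect):

```shell
# /etc/modprobe.conf (or a file under /etc/modprobe.d/, depending on distro)
# Cap the number of OST service threads; 128 here is only an example value.
options ost ost_num_threads=128
```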
Essentially what is happening is that your drives are only able to
sustain a certain amount of parallel I/O activity before performance
degrades due to all the seeking going on. Ideally you would set the
number of OST threads to that number, but this is complicated by the
fact that different workloads (i.e. different I/O sizes) result in
different numbers of parallel streams the drives can handle.
In any case, once you reach that point of congestion the performance
only goes downhill: the extra threads just wait for I/O and contribute
to your load average (LA) figures.
You need to experiment a bit to see what number of threads makes sense
for you.
Perhaps start with a number of threads equal to the number of actual
disk spindles you have on that node (if you use RAID5+, subtract any
spindles not used for actual data, e.g. 1/3 of the spindles for RAID5),
and watch the performance of the clients during usual workloads (not the
LA on the OSSes; it won't go much higher than the max threads you
specify). If you feel the performance has degraded, try increasing the
thread count somewhat and see how that works, until performance starts
degrading again or until you reach satisfactory performance.
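As a worked example of that spindle-counting heuristic (the array layout
here is hypothetical; plug in your own numbers):

```shell
# Hypothetical OSS: 4 RAID5 arrays of 9 disks each (8 data + 1 parity).
# RAID5 burns one spindle's worth of capacity per array on parity, so we
# subtract it when computing a starting value for ost_num_threads.
arrays=4
disks_per_array=9
parity_per_array=1
data_spindles=$(( arrays * (disks_per_array - parity_per_array) ))
echo "$data_spindles"   # starting point for ost_num_threads
```

From there, adjust up or down based on observed client throughput.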
If your disk configuration does not have writeback cache enabled and
your activity is mostly writes, you might also want to give the patch
from bug 16919 a try. It removes the synchronous journal commit
requirement and therefore should somewhat speed up OST writes in this
case (unless you already use a fast external journal, or unless a write
cache is enabled that already mitigates the synchronous journal commits).
Hope that helps.
Bye,
Oleg