[lustre-discuss] kernel threads for rpcs in flight
Anna Fuchs
anna.fuchs at uni-hamburg.de
Sun Apr 28 15:54:34 PDT 2024
Hello everyone.
The setting |max_rpcs_in_flight| affects, among other things, how many
threads can be spawned simultaneously for processing the RPCs, right?
In tests where the network is clearly the bottleneck, this setting has
almost no effect: the network cannot deliver the data fast enough, so
there is not much to process in parallel.
With a faster network, the stats show higher CPU utilization on
different cores (at least on the client).
What is the exact mechanism that decides when a kernel thread is
spawned for processing a bulk RPC? Is there an RPC queue with timings or
something similar?
Is it in any way predictable or calculable how many threads a specific
workload will require (and spawn, if possible), given the data rates of
the network and storage devices?
With |max_rpcs_in_flight = 1|, multiple cores are loaded, presumably
alternately, but the statistics are too inaccurate to capture this.
The distribution of threads to cores is regulated by the Linux kernel,
right? Does anyone have experience with what happens when all CPUs are
under full load with the application or something else?
Do the Lustre threads suffer? Is there a prioritization of the Lustre
threads over other tasks?
Are there readily available statistics or tools for this scenario?
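For context, the kind of per-core measurement I have in mind can be sketched by sampling /proc/stat at a short interval. This is a minimal sketch, not a Lustre tool: the function names are my own, it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal), and it reports total utilization per core without attributing it to Lustre threads.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip non-CPU lines and the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user nice system idle iowait irq softirq steal
        vals = [int(v) for v in fields[1:9]]
        if len(vals) < 5:
            continue  # unexpected layout, ignore this line
        idle = vals[3] + vals[4]  # idle + iowait
        total = sum(vals)
        out[fields[0]] = (total - idle, total)
    return out


def utilization(before, after):
    """Fraction of time each core was busy between two samples."""
    util = {}
    for cpu, (busy1, total1) in after.items():
        if cpu not in before:
            continue
        busy0, total0 = before[cpu]
        dt = total1 - total0
        util[cpu] = (busy1 - busy0) / dt if dt > 0 else 0.0
    return util


def measure(interval_s=0.1, path="/proc/stat"):
    """Take two samples interval_s apart and return per-core utilization."""
    with open(path) as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_proc_stat(f.read())
    return utilization(first, second)
```

Calling measure(0.1) repeatedly during a run gives a per-core time series, but the resolution is limited by the kernel's tick accounting, so very short alternation between cores may still be invisible.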
Thanks a lot
Anna
--
Anna Fuchs
Universität Hamburg
Department of Computer Science
Research Group Scientific Computing
Bundesstraße 45a
D-20146 Hamburg