[Lustre-discuss] understanding max_rpc_in_flight

Thu Mar 12 00:41:13 PDT 2015

On 2015/03/11, 8:15 PM, "teng wang" <tzw0019 at gmail.com> wrote:

Dear All,

Is there a concrete explanation of max_rpc_in_flight?
Most definitions online simply restate what the name suggests.
Can anyone explain it in terms of the I/O path?
For example, when the client wants to write, it sends an RPC
for the write to the object server. The server then pulls
the data with RDMA and commits it to disk. During this process,
only 1 RPC is issued.
How does 'max_rpc_in_flight' apply in this situation?
Could anyone explain it more concretely?

It is exactly as you describe above.  With the default max_rpc_in_flight = 8, a client can have up to 8 RPCs outstanding to _each_ OST, sent but not yet replied to, before it blocks the sending of more RPCs to that OST.  A new RPC can be sent only after an outstanding RPC completes its RDMA and its reply arrives back at the client.
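
To make the windowing concrete, here is a small Python sketch of the behaviour described above for one client and one OST. It is only a toy model, not Lustre code: the function name simulate_window, the fixed service_time, and the assumption that the client always has another RPC queued are all made up for illustration.

```python
import heapq


def simulate_window(num_rpcs, max_in_flight=8, service_time=1.0):
    """Toy model of one client sending RPCs to ONE OST.

    Assumptions (hypothetical, for illustration only):
      - every RPC takes exactly `service_time` from send to reply
        (RDMA transfer included),
      - the client always has the next RPC ready to send.

    Returns (peak_in_flight, finish_time).
    """
    in_flight = []  # min-heap of reply times of outstanding RPCs
    now = 0.0
    peak = 0
    for _ in range(num_rpcs):
        if len(in_flight) == max_in_flight:
            # Window is full: the client blocks until the earliest
            # outstanding RPC gets its reply, which frees one slot.
            now = heapq.heappop(in_flight)
        heapq.heappush(in_flight, now + service_time)
        peak = max(peak, len(in_flight))
    finish = max(in_flight) if in_flight else now
    return peak, finish


if __name__ == "__main__":
    for window in (1, 8):
        peak, finish = simulate_window(20, max_in_flight=window)
        print(f"window={window}: peak in flight={peak}, finished at t={finish}")
```

In this model a window of 1 serialises the 20 RPCs, while a window of 8 lets them complete in three rounds. The limit is per OST, so a client writing to several OSTs would have one such window for each of them.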

Whether there are actually 8 concurrent RDMAs between the client and server depends on the server load, queue depth, arrival time of the RPCs, etc.  The RPC processing can also be managed to some extent on the server with the "Network Request Scheduler" (NRS).

Cheers, Andreas