[Lustre-discuss] write RPC & congestion
burlen
burlen.loring at gmail.com
Tue Aug 17 13:15:18 PDT 2010
Hi, thanks for the previous help.
I have some questions about Lustre RPCs and the sequence of events that
occurs during large concurrent write() calls involving many processes and
a large data size per process. I understand there is a flow control
mechanism based on credits, but I'm a little unclear on how it works in
general after reading the "networking & io protocol" white paper.
Is it true that a write() RPC transfers data in chunks of at least 1MB
and at most (max_pages_per_rpc*page_size) bytes, where page_size=2^16?
If so, can I use these bounds to estimate the number of RPCs issued per
MB of data written?
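To make the estimate concrete, here is a minimal sketch of the arithmetic
I have in mind. It takes the bounds I guessed above at face value (they
may well be wrong), and the function name, the 256 pages per RPC, and the
1 GiB total are placeholders I made up for illustration, not values taken
from Lustre.

```python
MIB = 1 << 20  # bytes in one MiB


def rpc_count_bounds(total_bytes, max_pages_per_rpc, page_size,
                     min_rpc_bytes=MIB):
    """Return (fewest, most) RPCs needed to move total_bytes.

    Assumes every RPC carries between min_rpc_bytes and
    max_pages_per_rpc * page_size bytes. That range is my guess,
    not confirmed Lustre behavior.
    """
    max_rpc_bytes = max_pages_per_rpc * page_size
    if max_rpc_bytes < min_rpc_bytes:
        raise ValueError("upper bound on RPC size is below the lower bound")
    # Integer ceiling division: -(-a // b) == ceil(a / b) for positive b.
    fewest = -(-total_bytes // max_rpc_bytes)  # every RPC is full size
    most = -(-total_bytes // min_rpc_bytes)    # every RPC is minimum size
    return fewest, most


# Placeholder inputs: 1 GiB written, 256 pages per RPC, 2^16-byte pages.
fewest, most = rpc_count_bounds(1 << 30, 256, 1 << 16)
print(f"between {fewest} and {most} RPCs")
```

If the real minimum and maximum RPC sizes turn out to be different, only
the two size arguments need to change.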
About how many concurrent incoming write() RPCs per OSS service thread
can a single server handle before it stops responding to incoming RPCs?
What happens to an RPC when the server is too busy to handle it? Is it
even issued by the client? Does the client have to poll and/or resend
the RPC? Does the process of polling for flow control credits add
significant network/server congestion?
Is it likely that a large number of RPCs/flow control credit requests
will induce enough network congestion that clients' RPCs time out?
How does the client handle such a timeout?
Burlen