[Lustre-discuss] write RPC & congestion

Jeremy Filizetti jeremy.filizetti at gmail.com
Mon Aug 23 19:58:24 PDT 2010


>
> > A single RPC request will initiate an RDMA transfer of at most
> > "max_pages_per_rpc" pages, where the page unit is the Lustre page size
> > (65536 here).  Each RDMA transfer is executed in 1MB chunks.  On a given
> > client, if there are more than "max_pages_per_rpc" pages of data available
> > to transfer, multiple RPCs are issued and multiple RDMAs are initiated.
>
> No, the max_pages_per_rpc is scaled down proportionately for systems with
> large PAGE_SIZE.  This is because the node doesn't know what the PAGE_SIZE
> of the peer is.
>
> There is a patch in bugzilla that does what you propose - submit larger IO
> request RPCs, and do multiple 1MB RDMA xfers per request.  However, this
> showed performance _loss_ in some cases (in particular shared-file IO), and
> the reason for this regression was never diagnosed.
>
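To make the mechanics above concrete, here is a small back-of-envelope sketch (illustrative Python, not Lustre source code) of how a write gets split into RPCs. It assumes the common case described above: a 1 MiB maximum RPC, with max_pages_per_rpc scaled down so that max_pages_per_rpc * PAGE_SIZE stays at 1 MiB regardless of the client's PAGE_SIZE:

```python
# Illustrative model (not Lustre source): splitting dirty data into RPCs.
# Assumption: the maximum RPC payload is 1 MiB, and max_pages_per_rpc is
# scaled so the RPC size is the same for any client PAGE_SIZE.
import math

MAX_RPC_BYTES = 1 << 20  # 1 MiB per RPC (typical default)

def max_pages_per_rpc(page_size):
    """max_pages_per_rpc shrinks proportionately as PAGE_SIZE grows."""
    return MAX_RPC_BYTES // page_size

def rpcs_for_write(nbytes, page_size):
    """How many RPCs are needed to move nbytes of dirty pages."""
    pages = math.ceil(nbytes / page_size)
    return math.ceil(pages / max_pages_per_rpc(page_size))

# 256 pages of 4 KiB or 16 pages of 64 KiB both fill one 1 MiB RPC:
print(max_pages_per_rpc(4096))         # 256
print(max_pages_per_rpc(65536))        # 16
print(rpcs_for_write(10 << 20, 4096))  # 10 RPCs for a 10 MiB write
```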

The larger RPCs from bug 16900 offered a significant performance gain when
working over the WAN.  Our use case involves a few clients that need fast
access rather than 100s or 1000s.  The attached PDF shows iozone performance
over the WAN in 10 ms RTT increments up to 200 ms for a single Lustre client
and a small Lustre setup (1 MDS, 2 OSS, 6 OSTs).  This test used an SDR
InfiniBand WAN connection with Obsidian Longbows to simulate delay.  I'm
not 100% sure the value I used for concurrent_sends is correct.

So even though this isn't geared toward most Lustre users, I think the
larger RPCs are pretty useful.  Plenty of people at LUG 2010 mentioned using
Lustre over the WAN in some way.
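Why larger RPCs help at high latency can be seen with a simple pipeline model (my own back-of-envelope sketch, not taken from the attached PDF): with a fixed number of RPCs in flight, a single client's throughput is capped at roughly rpc_size * rpcs_in_flight / RTT, so at a given RTT the only ways to raise the ceiling are more RPCs in flight or bigger RPCs:

```python
# Back-of-envelope throughput ceiling for one client on a high-RTT link.
# Illustrative only: real throughput also depends on server scheduling,
# credits, and the underlying transport.

def throughput_mb_s(rpc_size_mb, rpcs_in_flight, rtt_ms):
    """Upper bound: the pipe holds rpcs_in_flight RPCs per round trip."""
    return rpc_size_mb * rpcs_in_flight / (rtt_ms / 1000.0)

# Default-style 1 MB RPCs with 8 in flight at 100 ms RTT:
print(throughput_mb_s(1, 8, 100))   # 80 MB/s ceiling
# 4 MB RPCs (in the spirit of bug 16900) at the same RTT:
print(throughput_mb_s(4, 8, 100))   # 320 MB/s ceiling
```

The same formula explains the steady decline with RTT in the iozone runs: doubling the RTT halves the ceiling unless the amount of data in flight doubles too.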


> > Would it be correct to say: The purpose of the "max_pages_per_rpc"
> > parameter is to enable the servers to even out the individual progress of
> > concurrent clients with a lot of data to move and more fairly apportion the
> > available bandwidth amongst concurrently writing clients?
>
> Yes, partly.  The more important factor is max_rpcs_in_flight, which limits
> the number of requests that a client can submit to each server at one time.
>
> There was a research paper on making max_rpcs_in_flight dynamic, which
> showed performance improvements when few clients are active, and we'd
> like to include that code in Lustre when it is ready.
>
>

Is there a patch available for this?


>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lustre_perf_with_large_rpcs.pdf
Type: application/pdf
Size: 227991 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100823/19c3304b/attachment.pdf>
