<div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">

&gt; A single RPC request will initiate an RDMA transfer of at most "max_pages_per_rpc" pages, where the page unit is the Lustre page size of 65536 bytes. Each RDMA transfer is executed in 1MB chunks. On a given client, if there are more than "max_pages_per_rpc" pages of data available to transfer, multiple RPCs are issued and multiple RDMAs are initiated.<br>


<br>

</div>No, the max_pages_per_rpc is scaled down proportionately for systems with large PAGE_SIZE.  This is because the node doesn't know what the PAGE_SIZE of the peer is.<br>
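<br>
As a rough illustration of that scaling (a sketch only, assuming the per-RPC data budget stays at 1 MiB regardless of page size; the name pages_per_rpc is hypothetical and not a Lustre function), the page count per RPC shrinks as PAGE_SIZE grows:<br>

```python
# Illustrative only: assumes each bulk RPC carries at most 1 MiB of data.
RPC_BYTES = 1024 * 1024  # assumed per-RPC data budget (1 MiB)


def pages_per_rpc(page_size):
    """Return how many whole pages of page_size bytes fit in one RPC."""
    return RPC_BYTES // page_size


# Compare a 4 KiB-page node with 16 KiB- and 64 KiB-page nodes.
for page_size in (4096, 16384, 65536):
    print(page_size, pages_per_rpc(page_size))
```

Under that assumption, a node with 64 KiB pages moves the same 1 MiB per RPC as a node with 4 KiB pages, just in fewer pages.<br>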

<br>

There is a patch in bugzilla that does what you propose - submit larger IO request RPCs, and do multiple 1MB RDMA xfers per request.  However, this showed performance _loss_ in some cases (in particular shared-file IO), and the reason for this regression was never diagnosed.<br>

</blockquote><div><br>The larger RPCs from bug 16900 offered a significant performance gain when working over the WAN.  Our use case involves a few clients that need fast access rather than 100s or 1000s of clients.  The included PDF shows iozone performance over the WAN in 10 ms RTT increments up to 200 ms for a single Lustre client and a small Lustre setup (1 MDS, 2 OSS, 6 OSTs).  This test used an SDR InfiniBand WAN connection, with Obsidian Longbows simulating the delay.  I'm not 100% sure the value used for concurrent_sends is correct.<br>
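<br>
To make the WAN argument concrete, here is a back-of-the-envelope sketch (not measured data, and not Lustre code). It assumes a client keeps a fixed number of RPCs outstanding to one target and that each RPC costs one full round trip; link bandwidth and server time are ignored. The function name and the sample values (8 RPCs in flight, 100 ms RTT) are illustrative assumptions:<br>

```python
MIB = 1024 * 1024


def throughput_ceiling_mib_s(rpc_bytes, rpcs_in_flight, rtt_ms):
    """Upper bound on throughput (MiB/s) for a latency-bound client.

    At most rpcs_in_flight RPCs of rpc_bytes each can complete per
    round trip, so the ceiling is (bytes in flight) / RTT.
    """
    bytes_in_flight = rpc_bytes * rpcs_in_flight
    rtt_seconds = rtt_ms / 1000.0
    return bytes_in_flight / rtt_seconds / MIB


# Hypothetical comparison: 1 MiB vs 4 MiB RPCs, 8 in flight, 100 ms RTT.
for rpc_mib in (1, 4):
    ceiling = throughput_ceiling_mib_s(rpc_mib * MIB, 8, 100)
    print(f"{rpc_mib} MiB RPCs: about {ceiling:.0f} MiB/s per target")
```

In this simplified model the ceiling grows linearly with RPC size, which is consistent with larger RPCs helping most at high RTT.<br>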

<br>So even though this isn't geared towards most Lustre users, I think the larger RPCs are pretty useful.  Plenty of people at LUG2010 mentioned using Lustre over the WAN in some way.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


<div class="im"><br>

> Would it be correct to say: The purpose of the "max_pages_per_rpc" parameter is to enable the servers to even out the individual progress of concurrent clients with a lot of data to move and more fairly apportion the available bandwidth amongst concurrently writing clients?<br>


<br>

</div>Yes, partly.  The more important factor is max_rpcs_in_flight, which limits the number of requests that a client can submit to each server at one time.<br>

<br>

A research paper on making max_rpcs_in_flight dynamic showed performance improvements when few clients are active, and we'd like to include that code in Lustre when it is ready.<br></blockquote>

<div><br>Was there a patch available for this?<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div class="im"><br>

Cheers, Andreas<br>

--<br>

Andreas Dilger<br>

Lustre Technical Lead<br>

Oracle Corporation Canada Inc.<br>

<br>

_______________________________________________<br>

</div><div><div></div><div class="h5">Lustre-discuss mailing list<br>

<a href="mailto:Lustre-discuss@lists.lustre.org">Lustre-discuss@lists.lustre.org</a><br>

<a href="http://lists.lustre.org/mailman/listinfo/lustre-discuss" target="_blank">http://lists.lustre.org/mailman/listinfo/lustre-discuss</a><br>

</div></div></blockquote></div><br>