[Lustre-discuss] RDMA limitation?
Zhen.Liang at Sun.COM
Tue Apr 13 20:43:21 PDT 2010
The story is roughly: "if you have to take dozens of global locks
during the lifetime of an RPC, then the code can't scale well on a large
SMP system, no matter what kind of network you are using", so the
problem is scattered everywhere.
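
To make the locking point concrete, here is a toy model, not Lustre code. It assumes each RPC takes a fixed number of locks during its lifetime and is handled on one CPU picked round-robin; the function name, the numbers (1200 RPCs, 12 CPUs, 24 locks per RPC) and the per-CPU lock layout are all made up for illustration. It only counts how many acquisitions land on the busiest lock in each design.

```python
from collections import Counter


def lock_acquisitions(num_rpcs, num_cpus, locks_per_rpc, per_cpu_locks):
    """Count how often each lock is taken while handling num_rpcs RPCs.

    Hypothetical model: RPC i is handled on CPU (i % num_cpus) and takes
    locks_per_rpc locks, one per processing step.
    """
    hits = Counter()
    for rpc in range(num_rpcs):
        cpu = rpc % num_cpus
        for step in range(locks_per_rpc):
            if per_cpu_locks:
                # Each CPU has its own copy of the lock for this step.
                lock_id = (step, cpu)
            else:
                # Every CPU contends on the same lock for this step.
                lock_id = (step, "global")
            hits[lock_id] += 1
    return hits


global_hits = lock_acquisitions(1200, 12, 24, per_cpu_locks=False)
percpu_hits = lock_acquisitions(1200, 12, 24, per_cpu_locks=True)

# Acquisitions on the busiest lock in each design.
print(max(global_hits.values()))  # 1200: every RPC goes through each global lock
print(max(percpu_hits.values()))  # 100: each lock only sees its own CPU's RPCs
```

In this model the traffic on any single lock drops by a factor equal to the number of CPUs, and that holds regardless of which network delivered the request.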
Also, we are trying to reduce RPC bouncing between CPUs. In the current
code, a request can be received by CPU A, then queued on CPU B,
processed by CPU C, and replied to by CPU D. That is very bad on a large
SMP system because of the data traffic between CPUs.
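
As a rough sketch of the bouncing problem (again a toy model, not Lustre code), the snippet below counts how many times one request changes CPU across the four stages mentioned above. The stage names, the `count_handoffs` helper and the "pinned" layout, where one CPU handles the whole request, are assumptions for illustration only.

```python
STAGES = ("receive", "queue", "process", "reply")


def count_handoffs(cpu_for_stage):
    """Count stage transitions where the request moves to a different CPU."""
    cpus = [cpu_for_stage[stage] for stage in STAGES]
    return sum(1 for prev, cur in zip(cpus, cpus[1:]) if prev != cur)


# Current behaviour as described: each stage may run on a different CPU.
bouncing = {"receive": "A", "queue": "B", "process": "C", "reply": "D"}

# Hypothetical target: the request stays on the CPU that received it.
pinned = {stage: "A" for stage in STAGES}

print(count_handoffs(bouncing))  # 3 cross-CPU handoffs per request
print(count_handoffs(pinned))    # 0
```

Each handoff means the request's data has to be pulled over to another CPU, so the cost grows with the request rate and with the size of the SMP system.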
> You mean it is inherent in the code? Can you point me to the actual
> code if possible? I am just curious why. Any pointers or hints will be
> On Tue, Apr 13, 2010 at 6:46 PM, Kevin Van Maren <Kevin.Vanmaren at sun.com> wrote:
>> Yes, the RPC rate is limited by Lustre code locking to that rate, even with
>> On Apr 13, 2010, at 5:08 PM, Jiahua <jiahua at gmail.com> wrote:
>>> Hi all,
>>> This is kind of a followup question of the thread "One or two OSS, no
>>> difference?" last month. In that thread, Andreas stated:
>>> "There is work currently underway to improve the SMP scaling
>>> performance for the RPC handling layer in Lustre. Currently that
>>> limits the delivered RPC rate to 10-15k/sec or so."
>>> My question is: is the limitation also applied to RDMA on IB? By SMP,
>>> I guess Andreas was talking about CPU, right? Since RDMA can bypass
>>> the host CPU, does it mean it can also bypass the limitation?
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org