[Lustre-discuss] RDMA limitation?
Liang Zhen
Zhen.Liang at Sun.COM
Wed Apr 14 21:23:43 PDT 2010
Jiahua wrote:
> Sorry to send it again! Can anyone help?
>
> Jiahua
>
>
> On Tue, Apr 13, 2010 at 10:45 PM, Jiahua <jiahua at gmail.com> wrote:
>
>> Thanks for your answers! More questions:
>>
>> * Do you only lock for writes? What if I only read? Do you still lock
>> even for simultaneous reads?
>>
"Lock" here refers to operating system synchronization primitives, not DLM
locks, so it is independent of whether the I/O is a read or a write.
>> * Is the limitation system wide or just in one server? That is, can I
>> improve the performance by adding more OSS or OST?
>>
The SMP improvements target the handling of small RPCs, so they mostly help
metadata performance, or I/O performance on NUMA systems. They are about
fully driving a single machine, not about the scalability of the whole cluster.
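
To illustrate the kind of contention meant here, the following is a minimal
sketch, not Lustre code. It contrasts one global lock taken by every handler
thread with one lock per partition. All names (GlobalLockCounter,
PartitionedCounter, NUM_PARTITIONS, run) are hypothetical, and the assumption
of one partition per CPU or NUMA node is only for illustration. Python's
interpreter lock means this shows the structure of the two patterns, not a
measurable speedup.

```python
import threading

NUM_PARTITIONS = 4  # hypothetical: one partition per CPU or NUMA node


class GlobalLockCounter:
    """Every handler thread serializes on one lock (the contended pattern)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def record(self, partition):
        with self._lock:  # all threads contend on this single lock
            self._count += 1

    def total(self):
        with self._lock:
            return self._count


class PartitionedCounter:
    """Each partition has its own lock and counter, so threads working on
    different partitions never contend with each other."""

    def __init__(self, partitions=NUM_PARTITIONS):
        self._locks = [threading.Lock() for _ in range(partitions)]
        self._counts = [0] * partitions

    def record(self, partition):
        with self._locks[partition]:  # contention limited to one partition
            self._counts[partition] += 1

    def total(self):
        # Reading takes each partition lock in turn and sums the counters.
        result = 0
        for index, lock in enumerate(self._locks):
            with lock:
                result += self._counts[index]
        return result


def run(counter, requests_per_thread=1000):
    """Simulate one handler thread per partition recording requests."""
    def worker(partition):
        for _ in range(requests_per_thread):
            counter.record(partition)

    threads = [threading.Thread(target=worker, args=(p,))
               for p in range(NUM_PARTITIONS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```

Both variants count the same number of requests. The difference is that the
partitioned one never makes a thread wait for work done on another partition,
which is the property that matters on a large SMP or NUMA machine.
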
>> * By RPC bouncing, are you talking about the Linux storage stack? It
>> is not inherent to Lustre, right?
>>
It is about the Lustre stack.
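
As a rough model of what "bouncing" means, the sketch below counts how many
times a request changes CPU across the four stages named in this thread
(receive, queue, process, reply). It is not Lustre code. The function names
are hypothetical, and the "affine" variant, which pins a request to one CPU
chosen by request id, is only an assumed illustration of reducing bounces,
not a description of the actual design.

```python
STAGES = ("receive", "queue", "process", "reply")


def count_hops(cpus_per_stage):
    """Number of times a request moves to a different CPU between stages."""
    return sum(1 for a, b in zip(cpus_per_stage, cpus_per_stage[1:]) if a != b)


def bouncing_path(request_id, num_cpus):
    """Worst case described in the thread: each stage lands on a different
    CPU (modelled here by stepping to the next CPU at every stage)."""
    return [(request_id + i) % num_cpus for i in range(len(STAGES))]


def affine_path(request_id, num_cpus):
    """Hypothetical alternative: map the request to one CPU for all stages."""
    cpu = request_id % num_cpus
    return [cpu] * len(STAGES)
```

With at least two CPUs, count_hops(bouncing_path(...)) is 3 per request and
count_hops(affine_path(...)) is 0. Each hop stands for request data having to
move between CPUs.
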
>> Thanks,
>> Jiahua
>>
>>
>> On Tue, Apr 13, 2010 at 8:43 PM, Liang Zhen <Zhen.Liang at sun.com> wrote:
>>
>>> The story is roughly: "if you have to take dozens of global locks during
>>> the lifetime of an RPC, then the code can't scale well on a large SMP
>>> system, no matter what kind of network you are using", so the problem is
>>> scattered everywhere.
>>> Also, we are trying to reduce RPC bouncing between CPUs. In the current
>>> code, a request can be received by CPU A, then queued on CPU B, processed
>>> by CPU C, and replied to by CPU D. That is very bad on a large SMP system
>>> because of the data traffic between CPUs.
>>>
>>> Regards
>>> Liang
>>>
>>> Jiahua wrote:
>>>
>>>> You mean it is inherent in the code? Can you point me to the actual
>>>> code if possible? I am just curious why. Any pointers or hints will be
>>>> appreciated.
>>>>
>>>> Thanks,
>>>> Jiahua
>>>>
>>>>
>>>> On Tue, Apr 13, 2010 at 6:46 PM, Kevin Van Maren <Kevin.Vanmaren at sun.com>
>>>> wrote:
>>>>
>>>>
>>>>> Yes, the RPC rate is limited to that rate by locking in the Lustre
>>>>> code, even with RDMA.
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>> On Apr 13, 2010, at 5:08 PM, Jiahua <jiahua at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> This is a follow-up to last month's thread "One or two OSS, no
>>>>>> difference?". In that thread, Andreas stated:
>>>>>>
>>>>>> "There is work currently underway to improve the SMP scaling
>>>>>> performance for the RPC handling layer in Lustre. Currently that
>>>>>> limits the delivered RPC rate to 10-15k/sec or so."
>>>>>>
>>>>>> My question is: does the limitation also apply to RDMA over IB? By SMP,
>>>>>> I guess Andreas was talking about CPUs, right? Since RDMA can bypass
>>>>>> the host CPU, does that mean it can also bypass the limitation?
>>>>>>
>>>>>> Thanks,
>>>>>> Jiahua
>>>>>> _______________________________________________
>>>>>> Lustre-discuss mailing list
>>>>>> Lustre-discuss at lists.lustre.org
>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>>>
>>>>>>
>>>