[Lustre-discuss] EXTERNAL: Re: LNET Performance Issue

Jeremy Filizetti jeremy.filizetti at gmail.com
Mon Feb 20 17:36:41 PST 2012


It does seem extreme for data-center IB latency, but the link may not be within
a data center.  An LNet write takes 2 RTTs, and a read takes 3, so those times
could be doubled or tripled, plus any overhead.
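
As a rough back-of-the-envelope model (the numbers below are assumed for
illustration, not measurements): with W RPCs in flight, throughput is capped at
roughly W * rpc_size / (n_rtt * RTT + transfer_time), so extender-class latency
shows up quickly:

    # 8 x 1MB RPCs, 2 RTTs of 1 ms each, plus ~0.3 ms of wire time
    awk 'BEGIN { printf "%.0f MB/s\n", (8 * 1) / ((2 * 1.0 + 0.3) / 1000) }'   # ~3478 MB/s ceiling
    # the same workload over a 5 ms RTT extender link
    awk 'BEGIN { printf "%.0f MB/s\n", (8 * 1) / ((2 * 5.0 + 0.3) / 1000) }'   # ~777 MB/s ceiling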

Carl, can you clarify whether you are using QDR IB and/or any campus or
wide-area IB extenders?

Jeremy

On Mon, Feb 20, 2012 at 8:14 PM, Kevin Van Maren <KVanMaren at fusionio.com> wrote:

> While it's possible the default credits (8, as I recall) are not enough for
> peak performance, it seems to me that something else is wrong:
> each 1MB RPC should take ~300us (based on MPI/IB transfer rates of 3.2+ GB/s),
> which means there is another 400us of overhead per RPC that is not masked
> with 8 concurrent RPCs, in addition to the overhead that was masked when he
> increased concurrency.  This is crazy, given a ~1us network latency.
>
> Unless the RPCs are being broken into tiny chunks or something -- does
> LNet do single-page transfers and not use a rendezvous protocol for full-sized
> RPCs?  It definitely seems that something is broken when o2iblnd gets ~1/3
> of the MPI bandwidth, given that the LND was designed for high-speed transfers.
>
> The max_rpcs_in_flight setting normally needs tweaking to improve disk
> concurrency, where a single client needs to drive a high queue depth.  I still
> find it hard to believe that 8 concurrent 1MB RPCs can't keep the network busy.
>
> Kevin
>
>
> On Feb 20, 2012, at 5:44 PM, "Jeremy Filizetti" <jeremy.filizetti at gmail.com> wrote:
>
> Am I reading your earlier post correctly that you have a single server
> acting as both the MDS and OSS?  Have you changed the peer_credits and
> credits options of the ko2iblnd kernel module on the server and client?  You
> also mentioned changing osc.*.max_dirty_mb; you probably need to adjust
> osc.*.max_rpcs_in_flight as well.  Can you post your RPC stats ("lctl
> get_param osc.*.rpc_stats")?  I would guess they are bunching up around 7-8
> if you're running with the default max_rpcs_in_flight=8.
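>
> For example (paths and values are illustrative only, not tuned
> recommendations; the ko2iblnd options require reloading the module):
>
>     # /etc/modprobe.d/ko2iblnd.conf, on both server and client
>     options ko2iblnd peer_credits=128 credits=256
>
>     # on the client
>     lctl set_param osc.*.max_rpcs_in_flight=32
>     lctl get_param osc.*.rpc_stats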
>
> Jeremy
>
>
> On Mon, Feb 20, 2012 at 4:59 PM, Barberi, Carl E <carl.e.barberi at lmco.com> wrote:
>
>> Thank you.  This did help.  With the concurrency set to 16, I was able
>> to get a max write speed of 1138 MB/s.  Any ideas on how we can make that
>> faster, though?  Ideally, we’d like to get to 1.5 GB/s.
>>
>> Carl
>>
>>
>> From: Liang Zhen [mailto:liang at whamcloud.com]
>> Sent: Thursday, February 16, 2012 1:45 AM
>> To: Barberi, Carl E
>> Cc: 'lustre-discuss at lists.Lustre.org'
>> Subject: EXTERNAL: Re: [Lustre-discuss] LNET Performance Issue
>>
>> Hi, I assume you are using "size=1M" for the brw test, right?  Performance
>> could increase if you set "concurrency" when adding the brw test, e.g.
>> --concurrency=16
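>>
>> A minimal run along those lines might look like this (the NIDs and session
>> name are placeholders; adjust them for your fabric):
>>
>>     modprobe lnet_selftest
>>     export LST_SESSION=$$
>>     lst new_session brw_test
>>     lst add_group clients 10.0.0.1@o2ib
>>     lst add_group servers 10.0.0.2@o2ib
>>     lst add_batch bulk
>>     lst add_test --batch bulk --concurrency=16 --from clients --to servers \
>>         brw write check=simple size=1M
>>     lst run bulk
>>     lst stat servers
>>     lst end_session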
>>
>> Liang
>>
>>
>> On Feb 16, 2012, at 3:30 AM, Barberi, Carl E wrote:
>>
>> We are having issues with LNET performance over InfiniBand.  We have a
>> configuration with a single MDT and six (6) OSTs.  The Lustre client I am
>> using to test is configured to use 6 stripes (lfs setstripe -c 6
>> /mnt/lustre).  When I perform a test using the following command:
>>
>>
>>                 dd if=/dev/zero of=/mnt/lustre/test.dat bs=1M count=2000
>>
>>
>> I typically get a write rate of about 815 MB/s, and we never exceed 848
>> MB/s.  When I run obdfilter-survey, we easily get about 3-4 GB/s write
>> speed, but when I run a series of lnet-selftests, the read and write rates
>> range from 850-875 MB/s max.  I have performed the following
>> optimizations to increase the data rate:
>>
>>
>> On the Client:
>>
>> lctl set_param osc.*.checksums=0
>> lctl set_param osc.*.max_dirty_mb=256
>>
>> On the OSTs:
>>
>> lctl set_param obdfilter.*.writethrough_cache_enable=0
>> lctl set_param obdfilter.*.read_cache_enable=0
>>
>> echo 4096 > /sys/block/<devices>/queue/nr_requests
>>
>>
>> I have also loaded the ib_sdp module, which brought a further increase in
>> speed.  However, we need to be able to record at no less than 1 GB/s, which
>> we cannot achieve right now.  Any thoughts on how I can optimize LNET,
>> which clearly seems to be the bottleneck?
>>
>> Thank you for any help you can provide,
>> Carl Barberi
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

