[Lustre-discuss] EXTERNAL: Re: LNET Performance Issue

Jeremy Filizetti jeremy.filizetti at gmail.com
Mon Feb 20 16:55:07 PST 2012


Apparently I didn't read your email very carefully, since you were actually
only asking about LNET performance and not client performance. You can
ignore max_rpcs_in_flight for LNET.  There used to be an
srpc_peer_credits setting in lnet_selftest; I'm not sure whether it exists in
your version.  You should be able to find out with "modinfo lnet_selftest".
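For example, something along these lines (the value below is only
illustrative, and check what modinfo actually reports for your build):

  # list the parameters this lnet_selftest build supports
  modinfo lnet_selftest | grep -i parm
  # if srpc_peer_credits is listed, try raising it via modprobe options, e.g.
  echo "options lnet_selftest srpc_peer_credits=32" > /etc/modprobe.d/lnet_selftest.conf
  # then unload/reload lnet_selftest on the test nodes before re-running the test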

On Mon, Feb 20, 2012 at 7:43 PM, Jeremy Filizetti <
jeremy.filizetti at gmail.com> wrote:

> Am I reading your earlier post correctly that you have a single server
> acting as the MDS and OSS?  Have you changed the peer_credits and credits
> options for the ko2iblnd kernel module on the server and client?  You also
> mentioned changing osc.*.max_dirty_mb; you probably need to adjust
> osc.*.max_rpcs_in_flight as well.  Can you post your RPC stats ("lctl
> get_param osc.*.rpc_stats")?  I would guess they are bunching up around 7-8
> if you're running with the default max_rpcs_in_flight=8.  A quick way to
> check and experiment is sketched below.
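> (The values below are only examples to experiment with, not tuned
> recommendations.)
>
>   # see what ko2iblnd was loaded with, if the parameters are exposed in sysfs
>   cat /sys/module/ko2iblnd/parameters/peer_credits \
>       /sys/module/ko2iblnd/parameters/credits
>   # allow more RPCs in flight on the client, then watch how rpc_stats shifts
>   lctl set_param osc.*.max_rpcs_in_flight=32
>   lctl get_param osc.*.rpc_stats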
>
> Jeremy
>
>
>
> On Mon, Feb 20, 2012 at 4:59 PM, Barberi, Carl E <carl.e.barberi at lmco.com> wrote:
>
>> Thank you.  This did help.  With the concurrency set to 16, I was able
>> to get a max write speed of 1138 MB/s.  Any ideas on how we can make that
>> faster, though?  Ideally, we’d like to get to 1.5 GB/s.
>>
>> Carl
>>
>> From: Liang Zhen [mailto:liang at whamcloud.com]
>> Sent: Thursday, February 16, 2012 1:45 AM
>> To: Barberi, Carl E
>> Cc: 'lustre-discuss at lists.Lustre.org'
>> Subject: EXTERNAL: Re: [Lustre-discuss] LNET Performance Issue
>>
>> Hi, I assume you are using "size=1M" for the brw test, right?  Performance
>> could increase if you set "concurrency" while adding the brw test, e.g.
>> --concurrency=16.
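>> A minimal session looks roughly like this (the group names and NIDs are
>> placeholders for your own client and server, and lnet_selftest has to be
>> loaded on every node involved):
>>
>>   export LST_SESSION=$$
>>   lst new_session brw_concurrency
>>   lst add_group clients 192.168.1.10@o2ib
>>   lst add_group servers 192.168.1.20@o2ib
>>   lst add_batch bulk_rw
>>   lst add_test --batch bulk_rw --concurrency=16 --from clients --to servers \
>>       brw write size=1M
>>   lst run bulk_rw
>>   lst stat clients servers    # let it print a few samples, then Ctrl-C
>>   lst stop bulk_rw
>>   lst end_session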
>>
>> Liang
>>
>> On Feb 16, 2012, at 3:30 AM, Barberi, Carl E wrote:
>>
>> We are having issues with LNET performance over InfiniBand.  We have a
>> configuration with a single MDT and six (6) OSTs.  The Lustre client I am
>> using to test is configured to use 6 stripes (lfs setstripe -c 6
>> /mnt/lustre).  When I perform a test using the following command:
>>
>>                 dd if=/dev/zero of=/mnt/lustre/test.dat bs=1M count=2000
>>
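>> As a quick sanity check, the following should confirm the test file really
>> lands on all six OSTs:
>>
>>                 lfs getstripe /mnt/lustre/test.dat
>>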
>> I typically get a write rate of about 815 MB/s, and we never exceed 848
>> MB/s.  When I run obdfilter-survey, we easily get about 3-4 GB/s write
>> speed, but when I run a series of lnet_selftest runs, the read and write
>> rates range from 850 MB/s to 875 MB/s max.  I have performed the following
>> optimizations to increase the data rate:
>>
>> On the Client:
>>
>> lctl set_param osc.*.checksums=0
>> lctl set_param osc.*.max_dirty_mb=256
>>
>> On the OSTs:
>>
>> lctl set_param obdfilter.*.writethrough_cache_enable=0
>> lctl set_param obdfilter.*.read_cache_enable=0
>>
>> echo 4096 > /sys/block/<devices>/queue/nr_requests
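>> For reference, the applied values can be double-checked afterwards (the
>> <devices> placeholder stands for whichever block devices back the OSTs):
>>
>> lctl get_param osc.*.checksums osc.*.max_dirty_mb
>> lctl get_param obdfilter.*.writethrough_cache_enable obdfilter.*.read_cache_enable
>> cat /sys/block/<devices>/queue/nr_requests
>>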
>>
>> I have also loaded the ib_sdp module, which also brought an increase in
>> speed.  However, we need to be able to record at no less than 1 GB/s, which
>> we cannot achieve right now.  Any thoughts on how I can optimize LNET,
>> which clearly seems to be the bottleneck?
>>
>> Thank you for any help you can provide,
>>
>> Carl Barberi
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>>
>

