[lustre-discuss] Problem with raising osc.*.max_rpcs_in_flight

Reinoud Bokhorst rbokhorst at astron.nl
Mon Jul 3 12:40:38 PDT 2017


Thanks for your answers. I should check the JIRA first next time :) It 
is unfortunate that the histogram doesn't go above 32 — are there other 
means of getting insight into that data?


In the meantime I found in the code that peer_credits_hiw must be 
between peer_credits/2 and (peer_credits-1). That clarifies that part, 
assuming it doesn't affect the RPC limits.
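
For anyone hitting the same surprise: a minimal sketch of that bounds 
check as I read it (my paraphrase in shell, not the actual module 
source; variable names are mine):

    # peer_credits_hiw is silently clamped into [peer_credits/2, peer_credits-1]
    # at module load, which is why our 128 came back as 127.
    peer_credits=128
    peer_credits_hiw=128
    hiw_min=$((peer_credits / 2))       # lower bound: peer_credits/2
    hiw_max=$((peer_credits - 1))       # upper bound: peer_credits-1
    [ "$peer_credits_hiw" -lt "$hiw_min" ] && peer_credits_hiw=$hiw_min
    [ "$peer_credits_hiw" -gt "$hiw_max" ] && peer_credits_hiw=$hiw_max
    echo "effective peer_credits_hiw=$peer_credits_hiw"   # prints 127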


Regards,
Reinoud


On 07/03/17 16:52, Cory Spitz wrote:
>
> This issue is tracked under https://jira.hpdd.intel.com/browse/LU-4533.
>
> We submitted, but then abandoned, a patch because of memory usage.  
> Andreas proposed doing dynamic allocation of the obd_histogram 
> buckets, but that work isn’t completed yet. It shouldn’t be too hard, 
> though. I just added the ‘easy’ label to LU-4533 ☺.
>
> -Cory
>
> -- 
>
> *From: *lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on 
> behalf of Patrick Farrell <paf at cray.com>
> *Date: *Monday, July 3, 2017 at 8:29 AM
> *To: *Andreas Dilger <andreas.dilger at intel.com>, Reinoud Bokhorst 
> <rbokhorst at astron.nl>
> *Cc: *"lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> *Subject: *Re: [lustre-discuss] Problem with raising 
> osc.*.max_rpcs_in_flight
>
> It definitely is limited to 32 buckets.  We've toyed with raising that 
> limit (and Cray did so internally), but a larger histogram does cost 
> some memory.
>
> So that's almost certainly the issue you're seeing, Reinoud.  RPC 
> counts higher than the largest bucket are all reported in the largest 
> bucket.
>
> - Patrick
>
> ------------------------------------------------------------------------
>
> *From:*lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on 
> behalf of Dilger, Andreas <andreas.dilger at intel.com>
> *Sent:* Sunday, July 2, 2017 3:45:03 AM
> *To:* Reinoud Bokhorst
> *Cc:* lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Problem with raising 
> osc.*.max_rpcs_in_flight
>
> It may also be that this histogram is limited to 32 buckets?
>
> Cheers, Andreas
>
> > On Jun 30, 2017, at 03:03, Reinoud Bokhorst <rbokhorst at astron.nl> wrote:
> >
> > Hi all,
> >
> > I have a problem with raising the osc.*.max_rpcs_in_flight client
> > setting on our Lustre 2.7.0. I am trying to increase the setting from
> > 32 to 64 but according to osc.*.rpc_stats it isn't taking effect; the
> > exact commands I use are shown after the table. The statistics still
> > stop at 31 RPCs, with a pile-up of write requests in the last bucket, e.g.
> >
> >                         read                    write
> > rpcs in flight        rpcs   % cum % |       rpcs   % cum %
> > 0:                       0   0   0   |          0   0   0
> > 1:                    7293  38  38   |       2231  16  16
> > 2:                    3872  20  59   |       1196   8  25
> > 3:                    1851   9  69   |        935   6  31
> > --SNIP--
> > 28:                      0   0 100   |         89   0  87
> > 29:                      0   0 100   |         90   0  87
> > 30:                      0   0 100   |         94   0  88
> > 31:                      0   0 100   |       1573  11 100
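> >
> > For reference, this is roughly how I set the parameter and reset the
> > stats between test runs (assuming the usual lctl interface on our 2.7
> > clients; as far as I know, writing any value to rpc_stats clears it):
> >
> >   lctl set_param osc.*.max_rpcs_in_flight=64   # raise the per-OSC cap
> >   lctl get_param osc.*.max_rpcs_in_flight      # confirm the new value
> >   lctl set_param osc.*.rpc_stats=0             # clear the histogram
> >   lctl get_param osc.*.rpc_stats               # re-read after the test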
> >
> > I have modified some ko2iblnd driver parameters in an attempt to get it
> > working:
> >
> > options ko2iblnd peer_credits=128 peer_credits_hiw=128 credits=2048
> > concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
> > fmr_flush_trigger=512 fmr_cache=1
> >
> > Specifically I raised peer_credits_hiw to 128 as I've understood that it
> > must be twice the value of max_rpcs_in_flight. Checking the module
> > parameters that were actually loaded (see the check below), I noticed
> > that it had been set to 127, so apparently it must be smaller than
> > peer_credits. After noticing this I tried setting max_rpcs_in_flight
> > to 60, but that didn't help either. Are there any other parameters
> > affecting the max RPCs? Do all settings have to be powers of 2?
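> >
> > The check I used (sysfs path as on our client; parameter visibility
> > may differ by kernel and module build):
> >
> >   cat /sys/module/ko2iblnd/parameters/peer_credits_hiw   # prints 127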
> >
> > A related question: documentation on the driver parameters and how they
> > all hang together is rather scarce on the internet. Does anyone have some
> > good pointers?
> >
> > Thanks,
> > Reinoud Bokhorst
> >
> >
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
