[lustre-discuss] Second read or write performance
fırat yılmaz
firatyilmazz at gmail.com
Fri Sep 21 18:46:11 PDT 2018
Hi Patrick,
Thank you for clarifying flock capabilities.
So many things can cause the difference between the 2 test results I saw in
the dashboard; now I learn that flock has no effect on it.
Best Regards.
On Sat, Sep 22, 2018, 04:14 Patrick Farrell <paf at cray.com> wrote:
> Firat,
>
> I strongly suspect that careful remeasurement of flock on/off will show
> that removing the flock option had no effect at all. It simply doesn’t DO
> anything like that - it controls a single flag that says, if you use flock
> operations, they work one way, or if it is not set, they work another way.
> It does nothing else, and has no impact on any part of file system
> operation except when flocks are used, and dd does not use flocks. It is
> simply impossible for the setting of the flock option to affect dd or
> performance level or variation, unless something using flocks is running at
> the same time. (And even then, it would be affecting it indirectly)
>
> I’m pushing back strongly because I’ve repeatedly seen people on the
> mailing list speculate about turning flock off as a way to increase
> performance, and it simply isn’t.
>
> - Patrick
>
>
> ------------------------------
> *From:* fırat yılmaz <firatyilmazz at gmail.com>
> *Sent:* Friday, September 21, 2018 7:50:51 PM
> *To:* Patrick Farrell
> *Cc:* adilger at whamcloud.com; lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Second read or write performance
>
> The problem was solved by adding a Lustre fine-tuning parameter on the OSS servers:
>
> lctl set_param obdfilter.lı-lustrefolder-OST*.brw_size=16
>
> The flock option is required by the application running on the filesystem,
> so it is enabled.
>
> Removing flock decreased the divergence of the fluctuations and gave about
> a 5% performance gain according to the IML dashboard.
>
> Best Regards.
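For reference, one way to confirm whether a client mount actually has flock set is to inspect its /proc/mounts entry. The mount line below is a hypothetical example, not taken from this system:

```shell
# Hypothetical Lustre client entry as it would appear in /proc/mounts;
# on a real client, read the live line with: grep ' lustre ' /proc/mounts
line='10.0.0.1@tcp:/lustre /vol1 lustre rw,flock,lazystatfs 0 0'

# Field 4 holds the mount options; split on commas and match the exact
# token "flock" so that "localflock" or "noflock" cannot match by accident.
opts=$(printf '%s\n' "$line" | awk '{print $4}')
if printf '%s\n' "$opts" | tr ',' '\n' | grep -qx flock; then
    echo "flock enabled"
else
    echo "flock disabled"
fi
```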
>
> On Sat, Sep 22, 2018 at 12:56 AM Patrick Farrell <paf at cray.com> wrote:
>
> Just 300 GiB, actually. But that's still rather large and could skew
> things depending on OST size.
>
> - Patrick
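Patrick's figure is easy to verify: dd with bs=1M count=300000 writes 300000 MiB, which is roughly 293 GiB (about 300 GB), three orders of magnitude less than 300 TB. A quick sketch of the arithmetic:

```shell
# dd if=... of=... bs=1M count=300000 writes count * bs bytes
bytes=$((300000 * 1024 * 1024))         # bs=1M is 1 MiB in GNU dd
gib=$((bytes / 1024 / 1024 / 1024))     # integer GiB, truncated
echo "test file size: ${gib} GiB"       # prints "test file size: 292 GiB"
```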
>
> On 9/21/18, 4:43 PM, "lustre-discuss on behalf of Andreas Dilger" <
> lustre-discuss-bounces at lists.lustre.org on behalf of adilger at whamcloud.com>
> wrote:
>
> On Sep 21, 2018, at 00:43, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
> >
> > Hi Andreas,
> > Tests are made with dd. The test folder was created by the related
> application company; I will check that when I have a connection. The OSTs
> have 85-86% free space and the filesystem is mounted with the flock option;
> I will ask to remove it and test again.
>
> The "flock" option shouldn't make any difference, unless the
> application is actually doing userspace file locking in the code.
> Definitely "dd" will not be using it.
>
> What does "lfs getstripe" on the first and second file as well as the
> parent directory show, and "lfs df" for the filesystem?
>
> > Read test dd if=/vol1/test_read/dd.test.`hostname` of=/dev/null
> bs=1M count=300000
> >
> > Write test dd if=/dev/zero of=/vol1/test_read/dd.test.2.`hostname`
> bs=1M count=300000
>
> This is creating a single file of 300TB in size, so that is definitely
> going to skew the space allocation.
>
> Cheers, Andreas
>
> >
> > On Thu, Sep 20, 2018 at 10:57 PM Andreas Dilger <
> adilger at whamcloud.com> wrote:
> > On Sep 20, 2018, at 03:07, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
> > >
> > > Hi all,
> > >
> > > OS=Redhat 7.4
> > > Lustre Version: Intel® Manager for Lustre* software 4.0.3.0
> > > Interconnect: Mellanox OFED, ConnectX-5
> > > 72 OST over 6 OSS with HA
> > > 1 MDT and 1 MGT on 2 MDS with HA
> > >
> > > Lustre servers fine tuning parameters:
> > > lctl set_param timeout=600
> > > lctl set_param ldlm_timeout=200
> > > lctl set_param at_min=250
> > > lctl set_param at_max=600
> > > lctl set_param obdfilter.*.read_cache_enable=1
> > > lctl set_param obdfilter.*.writethrough_cache_enable=1
> > > lctl set_param obdfilter.lfs3test-OST*.brw_size=16
> > >
> > > Lustre clients fine tuning parameters:
> > > lctl set_param osc.*.checksums=0
> > > lctl set_param timeout=600
> > > lctl set_param at_min=250
> > > lctl set_param at_max=600
> > > lctl set_param ldlm.namespaces.*.lru_size=2000
> > > lctl set_param osc.*OST*.max_rpcs_in_flight=256
> > > lctl set_param osc.*OST*.max_dirty_mb=1024
> > > lctl set_param osc.*.max_pages_per_rpc=1024
> > > lctl set_param llite.*.max_read_ahead_mb=1024
> > > lctl set_param llite.*.max_read_ahead_per_file_mb=1024
> > >
> > > Mountpoint stripe count: 72, stripe size: 1M
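One thing worth noting about the client settings above: assuming the usual 4 KiB page size, osc.*.max_pages_per_rpc=1024 caps each bulk RPC at 4 MiB, so a server-side brw_size of 16 (MiB) can only be filled if the client allows 4096 pages per RPC. A sketch of the arithmetic, under that page-size assumption:

```shell
page_kib=4           # assumed client page size (4 KiB, x86_64 default)
pages_per_rpc=1024   # osc.*.max_pages_per_rpc from the client settings above
brw_size_mib=16      # obdfilter.*.brw_size from the server settings above

# largest bulk RPC the client will issue with these settings
rpc_mib=$((pages_per_rpc * page_kib / 1024))
echo "client max RPC size: ${rpc_mib} MiB"

# pages per RPC needed to fill one brw_size-sized server-side bulk RPC
needed=$((brw_size_mib * 1024 / page_kib))
echo "max_pages_per_rpc needed for ${brw_size_mib} MiB RPCs: ${needed}"
```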
> > >
> > > I have a 2 PB Lustre filesystem. In the benchmark tests I get the
> optimum values for read and write, but when I start a concurrent I/O
> operation, the second job's throughput stays around 100-200 MB/s. I have
> tried lowering the stripe count to 36, but since the concurrent operations
> will not occur in a way that keeps the OST volumes balanced, I think that
> is not a good way to move on. Secondly, I saw some discussion about turning
> off flock, which ended up unpromising.
> > >
> > > As I check the stripe behaviour:
> > > the first operation starts to use the first 36 OSTs;
> > > when a second job starts during the first job, it uses the second 36 OSTs.
> > >
> > > But when the second job starts after the 1st job, it uses the first 36
> OSTs, which causes OST imbalance.
> > >
> > > Is there a round-robin setup so that each set of 36 OSTs is used in a
> round-robin way?
> > >
> > > Any kind of suggestion is appreciated.
> >
> > Can you please describe what command you are using for testing?
> Lustre is already using round-robin OST allocation by default, so the
> second job should use the next set of 36 OSTs, unless the file layout has
> been specified e.g. to start on OST0000, or the space usage of the OSTs is
> very imbalanced (more than 17% of the remaining free space).
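Andreas's 17% figure can be illustrated with hypothetical numbers. The free-space values below are invented, and the real allocator's weighted (QOS) computation is more involved than this simple ratio; this is only a sketch of when the default round-robin behaviour would be expected to change:

```shell
# Invented free-space figures (GiB) for the fullest and emptiest OSTs.
max_free=25000
min_free=20000

avg=$(( (max_free + min_free) / 2 ))
diff=$(( max_free - min_free ))
pct=$(( 100 * diff / avg ))            # imbalance relative to average free space
echo "free-space imbalance: ${pct}%"

# past the ~17% threshold the allocator weights OSTs by free space
# instead of allocating purely round-robin
if [ "$pct" -gt 17 ]; then
    echo "weighted (space-balancing) allocation expected"
else
    echo "round-robin allocation expected"
fi
```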
> >
> > Cheers, Andreas
> > ---
> > Andreas Dilger
> > Principal Lustre Architect
> > Whamcloud
> >
>
> Cheers, Andreas
> ---
> Andreas Dilger
> Principal Lustre Architect
> Whamcloud
>