[lustre-discuss] Second read or write performance

fırat yılmaz firatyilmazz at gmail.com
Fri Sep 21 18:46:11 PDT 2018


Hi Patrick,

Thank you for clarifying flock capabilities.
So many things can cause the difference between the two test results I saw in
the dashboard; now I have learned that flock has no effect on it.

Best Regards.





On Sat, 22 Sep 2018 at 04:14, Patrick Farrell <paf at cray.com> wrote:

> Firat,
>
> I strongly suspect that careful remeasurement of flock on/off will show
> that removing the flock option had no effect at all.  It simply doesn’t DO
> anything like that: it controls a single flag that determines how flock
> operations behave when they are used, one way if the option is set and
> another if it is not.
> It does nothing else, and has no impact on any part of file system
> operation except when flocks are used, and dd does not use flocks. It is
> simply impossible for the setting of the flock option to affect dd or
> performance level or variation, unless something using flocks is running at
> the same time.  (And even then, it would be affecting it indirectly)
>
> I’m pushing back strongly because I’ve repeatedly seen people on the
> mailing list speculate about turning flock off as a way to increase
> performance, and it simply isn’t one.
>
> - Patrick
>
>
> ------------------------------
> *From:* fırat yılmaz <firatyilmazz at gmail.com>
> *Sent:* Friday, September 21, 2018 7:50:51 PM
> *To:* Patrick Farrell
> *Cc:* adilger at whamcloud.com; lustre-discuss at lists.lustre.org
> *Subject:* Re: [lustre-discuss] Second read or write performance
>
> The problem was solved by adding a Lustre tuning parameter on the OSS servers:
>
> lctl set_param obdfilter.lı-lustrefolder-OST*.brw_size=16
>
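> For the larger 16 MB bulk RPCs to actually be used, the clients also need a
> matching osc RPC size; a minimal sketch, assuming 4 KiB pages (this is the
> same tunable that appears in the client tuning list quoted further down):
>
> # on the OSS, confirm the new bulk RPC size (value is in MB)
> lctl get_param obdfilter.*.brw_size
> # on each client, allow 16 MB RPCs (4096 pages x 4 KiB = 16 MB)
> lctl set_param osc.*.max_pages_per_rpc=4096
>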
> The flock option is required by the application running on the filesystem,
> so it remains enabled.
>
> Removing flock decreased the spread of the fluctuations and gave about a 5%
> performance gain according to the IML dashboard.
>
> Best Regards.
>
> On Sat, Sep 22, 2018 at 12:56 AM Patrick Farrell <paf at cray.com> wrote:
>
> Just 300 GiB, actually.  But that's still rather large and could skew
> things depending on OST size.
>
> - Patrick
>
> On 9/21/18, 4:43 PM, "lustre-discuss on behalf of Andreas Dilger" <
> lustre-discuss-bounces at lists.lustre.org on behalf of adilger at whamcloud.com>
> wrote:
>
>     On Sep 21, 2018, at 00:43, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
>     >
>     > Hi Andreas,
>     > The tests are made with dd. The test folder was created by the related
>     > application company; I will check that when I have a connection. The OSTs
>     > have 85-86% free space and the filesystem is mounted with the flock option;
>     > I will ask for it to be removed and test again.
>
>     The "flock" option shouldn't make any difference, unless the
> application is actually doing userspace file locking in the code.
> Definitely "dd" will not be using it.
>
>     What does "lfs getstripe" on the first and second file as well as the
> parent directory show, and "lfs df" for the filesystem?
>
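>     For example, something like the following, using the paths from the dd
>     commands quoted below:
>
>     lfs getstripe /vol1/test_read/dd.test.`hostname`
>     lfs getstripe -d /vol1/test_read
>     lfs df /vol1/test_read
>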
>     > Read test:  dd if=/vol1/test_read/dd.test.`hostname` of=/dev/null bs=1M count=300000
>     >
>     > Write test: dd if=/dev/zero of=/vol1/test_read/dd.test.2.`hostname` bs=1M count=300000
>
>     This is creating a single file of 300TB in size, so that is definitely
> going to skew the space allocation.
>
>     Cheers, Andreas
>
>     >
>     > On Thu, Sep 20, 2018 at 10:57 PM Andreas Dilger <
> adilger at whamcloud.com> wrote:
>     > On Sep 20, 2018, at 03:07, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
>     > >
>     > > Hi all,
>     > >
>     > > OS=Redhat 7.4
>     > > Lustre Version: Intel® Manager for Lustre* software 4.0.3.0
>     > > Interconnect: Mellanox OFED, ConnectX-5
>     > > 72 OST over 6 OSS with HA
>     > > 1mdt and 1 mgt on 2 MDS with HA
>     > >
>     > > Lustre servers fine tuning parameters:
>     > > lctl set_param timeout=600
>     > > lctl set_param ldlm_timeout=200
>     > > lctl set_param at_min=250
>     > > lctl set_param at_max=600
>     > > lctl set_param obdfilter.*.read_cache_enable=1
>     > > lctl set_param obdfilter.*.writethrough_cache_enable=1
>     > > lctl set_param obdfilter.lfs3test-OST*.brw_size=16
>     > >
>     > > Lustre clients fine tuning parameters:
>     > > lctl set_param osc.*.checksums=0
>     > > lctl set_param timeout=600
>     > > lctl set_param at_min=250
>     > > lctl set_param at_max=600
>     > > lctl set_param ldlm.namespaces.*.lru_size=2000
>     > > lctl set_param osc.*OST*.max_rpcs_in_flight=256
>     > > lctl set_param osc.*OST*.max_dirty_mb=1024
>     > > lctl set_param osc.*.max_pages_per_rpc=1024
>     > > lctl set_param llite.*.max_read_ahead_mb=1024
>     > > lctl set_param llite.*.max_read_ahead_per_file_mb=1024
>     > >
>     > > Mount point stripe count: 72, stripe size: 1M
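>     > > For reference, a default layout like that would typically be set with
>     > > something like the following (directory path assumed):
>     > >
>     > > lfs setstripe -c 72 -S 1M /vol1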
>     > >
>     > > I have a 2 PB Lustre filesystem. In the benchmark tests I get the optimum
>     > > values for read and write, but when I start a concurrent I/O operation, the
>     > > second job's throughput stays around 100-200 MB/s. I have tried lowering the
>     > > stripe count to 36, but since the concurrent operations will not occur in a
>     > > way that keeps the OST usage balanced, I do not think that is a good way to
>     > > move on. Secondly, I saw some discussion about turning off flock, which
>     > > ended up unpromising.
>     > >
>     > > As I check the striping behaviour:
>     > > the first operation starts by using the first 36 OSTs;
>     > > when a second job starts during the first job, it uses the second 36 OSTs.
>     >
>     > > But when the second job starts after the first job, it uses the first 36
>     > > OSTs again, which causes OST imbalance.
>     > >
>     > > Is there a round-robin setup so that each group of 36 OSTs is used in a
>     > > round-robin way?
>     > >
>     > > Any suggestions are appreciated.
>     >
>     > Can you please describe what command you are using for testing.
> Lustre is already using round-robin OST allocation by default, so the
> second job should use the next set of 36 OSTs, unless the file layout has
> been specified e.g. to start on OST0000 or the space usage of the OSTs is
> very imbalanced (more than 17% of the remaining free space).
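>     >
>     > For example, the OST fill levels can be checked from a client with
>     > "lfs df", and the allocator's imbalance threshold on the MDS with
>     > something like the following (the parameter path differs between
>     > releases, lod vs. lov):
>     >
>     > lfs df /vol1/test_read
>     > lctl get_param lod.*.qos_threshold_rr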
>     >
>     > Cheers, Andreas
>     > ---
>     > Andreas Dilger
>     > Principal Lustre Architect
>     > Whamcloud
>     >
>
>     Cheers, Andreas
>     ---
>     Andreas Dilger
>     Principal Lustre Architect
>     Whamcloud