[lustre-discuss] Second read or write performance

fırat yılmaz firatyilmazz at gmail.com
Fri Sep 21 17:50:51 PDT 2018


The problem was solved by adding a Lustre tuning parameter on the OSS servers:
lctl set_param obdfilter.lı-lustrefolder-OST*.brw_size=16
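
Note that brw_size is given in MB, so this enables 16 MB bulk RPCs on the
OSS side. For clients to actually send 16 MB RPCs, osc.*.max_pages_per_rpc
has to be raised to match; a minimal sketch, assuming 4 KiB client pages:

lctl set_param osc.*.max_pages_per_rpc=4096   # 4096 x 4 KiB = 16 MiB per RPC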

flock is required by the application running on the filesystem, so the flock
option remains enabled.

Removing flock reduced the spread of the throughput fluctuations and gave
about a 5% performance gain according to the IML dashboard.
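
For anyone repeating the flock comparison, the behaviour is selected by the
client mount option ("mgsnode:/fsname" and the mount point below are
placeholders):

mount -t lustre -o flock mgsnode:/fsname /vol1        # coherent, cluster-wide flock
mount -t lustre -o localflock mgsnode:/fsname /vol1   # node-local flock only, cheaper
mount -t lustre mgsnode:/fsname /vol1                 # default: flock calls are refused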

Best Regards.

On Sat, Sep 22, 2018 at 12:56 AM Patrick Farrell <paf at cray.com> wrote:

> Just 300 GiB, actually.  But that's still rather large and could skew
> things depending on OST size.
>
> - Patrick
>
> On 9/21/18, 4:43 PM, "lustre-discuss on behalf of Andreas Dilger" <
> lustre-discuss-bounces at lists.lustre.org on behalf of adilger at whamcloud.com>
> wrote:
>
>     On Sep 21, 2018, at 00:43, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
>     >
>     > Hi Andreas,
>     > Tests are made with dd. The test folder was created by the related
> application vendor; I will check it when I have a connection. The OSTs have
> 85-86% free space, and the filesystem is mounted with the flock option; I
> will ask for flock to be removed and test again.
>
>     The "flock" option shouldn't make any difference, unless the
> application is actually doing userspace file locking in the code.
> Definitely "dd" will not be using it.
>
>     What does "lfs getstripe" on the first and second file as well as the
> parent directory show, and "lfs df" for the filesystem?
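>
>     For example, using the test paths from the dd commands below
> (<hostname> stands for the client's hostname):
>
>     lfs getstripe /vol1/test_read/dd.test.<hostname>
>     lfs getstripe -d /vol1/test_read
>     lfs df -h /vol1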
>
>     > Read test:
>     > dd if=/vol1/test_read/dd.test.`hostname` of=/dev/null bs=1M count=300000
>     >
>     > Write test:
>     > dd if=/dev/zero of=/vol1/test_read/dd.test.2.`hostname` bs=1M count=300000
>
>     This is creating a single file of 300TB in size, so that is definitely
> going to skew the space allocation.
>
>     Cheers, Andreas
>
>     >
>     > On Thu, Sep 20, 2018 at 10:57 PM Andreas Dilger <
> adilger at whamcloud.com> wrote:
>     > On Sep 20, 2018, at 03:07, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
>     > >
>     > > Hi all,
>     > >
>     > > OS: Red Hat 7.4
>     > > Lustre version: Intel® Manager for Lustre* software 4.0.3.0
>     > > Interconnect: Mellanox OFED, ConnectX-5
>     > > 72 OST over 6 OSS with HA
>     > > 1mdt and 1 mgt on 2 MDS with HA
>     > >
>     > > Lustre servers fine tuning parameters:
>     > > lctl set_param timeout=600
>     > > lctl set_param ldlm_timeout=200
>     > > lctl set_param at_min=250
>     > > lctl set_param at_max=600
>     > > lctl set_param obdfilter.*.read_cache_enable=1
>     > > lctl set_param obdfilter.*.writethrough_cache_enable=1
>     > > lctl set_param obdfilter.lfs3test-OST*.brw_size=16
>     > >
>     > > Lustre clients fine tuning parameters:
>     > > lctl set_param osc.*.checksums=0
>     > > lctl set_param timeout=600
>     > > lctl set_param at_min=250
>     > > lctl set_param at_max=600
>     > > lctl set_param ldlm.namespaces.*.lru_size=2000
>     > > lctl set_param osc.*OST*.max_rpcs_in_flight=256
>     > > lctl set_param osc.*OST*.max_dirty_mb=1024
>     > > lctl set_param osc.*.max_pages_per_rpc=1024
>     > > lctl set_param llite.*.max_read_ahead_mb=1024
>     > > lctl set_param llite.*.max_read_ahead_per_file_mb=1024
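>     > >
>     > > Note: plain "lctl set_param" values are lost on reboot or remount;
> one way to make a setting permanent is "lctl set_param -P" run once on the
> MGS, for example:
>     > >
>     > > lctl set_param -P osc.*.max_rpcs_in_flight=256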
>     > >
>     > > Mount point default stripe count: 72, stripe size: 1M
>     > >
>     > > I have a 2 PB Lustre filesystem. In benchmark tests I get the
> optimum values for read and write, but when I start a concurrent I/O
> operation, the second job's throughput stays around 100-200 MB/s. I have
> tried lowering the stripe count to 36, but since the concurrent operations
> will not occur in a way that keeps the OST volumes balanced, I don't think
> that is a good way to proceed. Secondly, I saw some discussion about
> turning off flock, which ended up unpromising.
>     > >
>     > > As I check the stripe behaviour:
>     > > the first operation starts using the first 36 OSTs;
>     > > when a second job starts during the first job, it uses the second
> 36 OSTs.
>     > >
>     > > But when the second job starts after the first job, it uses the
> first 36 OSTs, which causes OST imbalance.
>     > >
>     > > Is there a round-robin setup so that each set of 36 OSTs is used in
> a round-robin fashion?
>     > >
>     > > Any suggestions are appreciated.
>     >
>     > Can you please describe what command you are using for testing?
> Lustre is already using round-robin OST allocation by default, so the
> second job should use the next set of 36 OSTs, unless the file layout has
> been specified e.g. to start on OST0000 or the space usage of the OSTs is
> very imbalanced (more than 17% of the remaining free space).
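>     >
>     > That 17% is the allocator's "qos_threshold_rr" setting; it can be
> inspected (and changed) on the MDS, e.g. (the parameter lives under "lod"
> on recent Lustre releases, under "lov" on older ones):
>     >
>     > lctl get_param lod.*.qos_threshold_rr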
>     >
>     > Cheers, Andreas
>     > ---
>     > Andreas Dilger
>     > Principal Lustre Architect
>     > Whamcloud
>
>     Cheers, Andreas
>     ---
>     Andreas Dilger
>     Principal Lustre Architect
>     Whamcloud
>

