[lustre-discuss] Second read or write performance
fırat yılmaz
firatyilmazz at gmail.com
Fri Sep 21 17:50:51 PDT 2018
The problem was solved by adding a Lustre tuning parameter on the OSS servers:

lctl set_param obdfilter.lı-lustrefolder-OST*.brw_size=16

The flock mount option is required by the application running on the filesystem,
so it remains enabled. Removing flock reduced the spread of the throughput
fluctuations and gave roughly a 5% performance gain according to the IML dashboard.
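For anyone hitting the same issue, a minimal sketch of the tuning, assuming a 4 KiB client page size (the fsname `lı-lustrefolder` is from this thread; substitute your own). When `brw_size` is raised to 16 (MiB) on the OSSes, clients generally also need `osc.*.max_pages_per_rpc` raised so their RPCs can actually fill the larger bulk size:

```shell
# On each OSS: raise the maximum bulk RPC size to 16 MiB.
lctl set_param obdfilter.lı-lustrefolder-OST*.brw_size=16

# On each client: allow RPCs large enough to fill the 16 MiB bulk size.
# 16 MiB / 4 KiB page size = 4096 pages per RPC.
lctl set_param osc.lı-lustrefolder-OST*.max_pages_per_rpc=4096

# On newer Lustre versions, the same setting can be made persistent
# from the MGS with "lctl set_param -P ..." (hedged: check your version).
```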
Best Regards.
On Sat, Sep 22, 2018 at 12:56 AM Patrick Farrell <paf at cray.com> wrote:
> Just 300 GiB, actually. But that's still rather large and could skew
> things depending on OST size.
>
> - Patrick
>
> On 9/21/18, 4:43 PM, "lustre-discuss on behalf of Andreas Dilger" <
> lustre-discuss-bounces at lists.lustre.org on behalf of adilger at whamcloud.com>
> wrote:
>
> On Sep 21, 2018, at 00:43, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
> >
> > Hi Andreas,
> > The tests are made with dd. The test folder was created by the related
> application vendor; I will check that when I have a connection. The OSTs have
> 85-86% free space, and the filesystem is mounted with the flock option; I will
> ask for it to be removed and test again.
>
> The "flock" option shouldn't make any difference, unless the
> application is actually doing userspace file locking in the code.
> Definitely "dd" will not be using it.
>
> What does "lfs getstripe" on the first and second file as well as the
> parent directory show, and "lfs df" for the filesystem?
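A sketch of the diagnostics Andreas asks for here (the paths and hostname suffix are illustrative examples, not taken from the thread):

```shell
# Layout of the two test files and their parent directory:
# shows stripe_count, stripe_size, and which OSTs each file landed on.
lfs getstripe /vol1/test_read/dd.test.client01
lfs getstripe /vol1/test_read/dd.test.2.client01
lfs getstripe -d /vol1/test_read

# Per-OST free space -- reveals whether allocation is skewed across OSTs.
lfs df -h /vol1
```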
>
> > Read test dd if=/vol1/test_read/dd.test.`hostname` of=/dev/null
> bs=1M count=300000
> >
> > Write test dd if=/dev/zero of=/vol1/test_read/dd.test.2.`hostname`
> bs=1M count=300000
>
> This is creating a single file of 300TB in size, so that is definitely
> going to skew the space allocation.
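For reference, the dd transfer size works out to about 293 GiB (roughly 300 GB) per file, matching Patrick's correction earlier in the thread:

```python
# dd with bs=1M count=300000 writes 300000 blocks of 1 MiB each.
bs = 1 * 2**20          # 1 MiB block size, in bytes
count = 300_000         # number of blocks

total_bytes = bs * count
gib = total_bytes / 2**30
print(f"{total_bytes} bytes = {gib:.1f} GiB")  # about 293 GiB per file
```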
>
> Cheers, Andreas
>
> >
> > On Thu, Sep 20, 2018 at 10:57 PM Andreas Dilger <
> adilger at whamcloud.com> wrote:
> > On Sep 20, 2018, at 03:07, fırat yılmaz <firatyilmazz at gmail.com>
> wrote:
> > >
> > > Hi all,
> > >
> > > OS=Redhat 7.4
> > > Lustre Version: Intel® Manager for Lustre* software 4.0.3.0
> > > Interconnect: Mellanox OFED, ConnectX-5
> > > 72 OST over 6 OSS with HA
> > > 1 MDT and 1 MGT on 2 MDS with HA
> > >
> > > Lustre servers fine tuning parameters:
> > > lctl set_param timeout=600
> > > lctl set_param ldlm_timeout=200
> > > lctl set_param at_min=250
> > > lctl set_param at_max=600
> > > lctl set_param obdfilter.*.read_cache_enable=1
> > > lctl set_param obdfilter.*.writethrough_cache_enable=1
> > > lctl set_param obdfilter.lfs3test-OST*.brw_size=16
> > >
> > > Lustre clients fine tuning parameters:
> > > lctl set_param osc.*.checksums=0
> > > lctl set_param timeout=600
> > > lctl set_param at_min=250
> > > lctl set_param at_max=600
> > > lctl set_param ldlm.namespaces.*.lru_size=2000
> > > lctl set_param osc.*OST*.max_rpcs_in_flight=256
> > > lctl set_param osc.*OST*.max_dirty_mb=1024
> > > lctl set_param osc.*.max_pages_per_rpc=1024
> > > lctl set_param llite.*.max_read_ahead_mb=1024
> > > lctl set_param llite.*.max_read_ahead_per_file_mb=1024
> > >
> > > Mountpoint stripe count: 72, stripe size: 1M
> > >
> > > I have a 2 PB Lustre filesystem. In the benchmark tests I get
> optimal values for read and write, but when I start a concurrent I/O
> operation, the second job's throughput stays around 100-200 MB/s. I tried
> lowering the stripe count to 36, but since the concurrent operations will
> not occur in a way that keeps the OST volumes balanced, I don't think that is
> a good way to proceed. Secondly, I saw some discussion about turning off
> flock, which ended up unpromising.
> > >
> > > As I checked the striping behaviour:
> > > the first operation starts on the first 36 OSTs;
> > > when a second job starts during the first job, it uses the second 36 OSTs.
> > >
> > > But when the second job starts after the first job finishes, it uses the
> first 36 OSTs again, which causes OST imbalance.
> > >
> > > Is there a round-robin setup so that each set of 36 OSTs is used in a
> round-robin way?
> > >
> > > Any suggestions are appreciated.
> >
> > Can you please describe what command you are using for testing?
> Lustre is already using round-robin OST allocation by default, so the
> second job should use the next set of 36 OSTs, unless the file layout has
> been specified e.g. to start on OST0000 or the space usage of the OSTs is
> very imbalanced (more than 17% of the remaining free space).
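A sketch of how a directory's default layout can be set explicitly (directory path illustrative; by default it is usually best to leave the starting OST to the allocator so round-robin works):

```shell
# Default layout for new files in this directory: 36 stripes, 1 MiB stripe
# size, starting OST chosen by the allocator (-i -1).
lfs setstripe -c 36 -S 1M -i -1 /vol1/test_read

# Forcing a starting index (e.g. -i 0) defeats round-robin: every new file
# would begin on OST0000, producing exactly the imbalance described above.
```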
> >
> > Cheers, Andreas
> > ---
> > Andreas Dilger
> > Principal Lustre Architect
> > Whamcloud
> >
>
> Cheers, Andreas
> ---
> Andreas Dilger
> Principal Lustre Architect
> Whamcloud
>