[lustre-discuss] Second read or write performance

Andreas Dilger adilger at whamcloud.com
Thu Sep 20 12:57:15 PDT 2018

On Sep 20, 2018, at 03:07, fırat yılmaz <firatyilmazz at gmail.com> wrote:
> Hi all,
> OS=Redhat 7.4
> Lustre Version: Intel® Manager for Lustre* software
> Interconnect: Mellanox OFED, ConnectX-5
> 72 OSTs across 6 OSSs with HA
> 1 MDT and 1 MGT on 2 MDSs with HA
> Lustre servers fine tuning parameters:
> lctl set_param timeout=600
> lctl set_param ldlm_timeout=200
> lctl set_param at_min=250
> lctl set_param at_max=600
> lctl set_param obdfilter.*.read_cache_enable=1
> lctl set_param obdfilter.*.writethrough_cache_enable=1
> lctl set_param obdfilter.lfs3test-OST*.brw_size=16
> Lustre clients fine tuning parameters:
> lctl set_param osc.*.checksums=0
> lctl set_param timeout=600
> lctl set_param at_min=250
> lctl set_param at_max=600
> lctl set_param ldlm.namespaces.*.lru_size=2000
> lctl set_param osc.*OST*.max_rpcs_in_flight=256
> lctl set_param osc.*OST*.max_dirty_mb=1024
> lctl set_param osc.*.max_pages_per_rpc=1024
> lctl set_param llite.*.max_read_ahead_mb=1024
> lctl set_param llite.*.max_read_ahead_per_file_mb=1024
> Mountpoint stripe count:72 stripesize:1M
> I have a 2 PB Lustre filesystem. In benchmark tests I get the optimum values for read and write, but when I start a concurrent I/O operation, the second job's throughput stays around 100-200 Mb/s. I tried lowering the stripe count to 36, but since concurrent operations will not always occur in a way that keeps OST usage balanced, I don't think that is a good way forward. I also saw some discussion about turning off flock, which did not look promising.
> When I check the striping behaviour:
> the first operation starts on the first 36 OSTs;
> when a second job starts while the first job is running, it uses the second 36 OSTs;
> but when the second job starts after the first job has finished, it uses the first 36 OSTs again, which causes OST imbalance.
> Is there a round-robin setting so that each set of 36 OSTs is used in turn?
> Any suggestions are appreciated.

Can you please describe what command you are using for testing? Lustre already uses round-robin OST allocation by default, so the second job should use the next set of 36 OSTs, unless the file layout has been specified explicitly (e.g. to start on OST0000) or the space usage of the OSTs is very imbalanced (more than 17% of the remaining free space).
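
A rough way to check both conditions from a client is sketched below. It assumes the lfs client utility is installed and that plain `lfs df` output carries the available space in its fourth column. The TESTDIR path and the ost_imbalance_pct helper are hypothetical names used only for illustration, and the percentage it prints only approximates the allocator's own balance check.

```shell
#!/bin/sh
# Hypothetical test directory on the Lustre client; adjust to your mount.
TESTDIR=${TESTDIR:-/mnt/lustre/testdir}

# Read `lfs df` output (1K-block units) on stdin and print the gap between
# the largest and smallest OST "Available" values as a percentage of the
# largest. Assumption: this mirrors the allocator's check only roughly.
ost_imbalance_pct() {
    awk '$1 ~ /OST[0-9a-fA-F]+_UUID$/ {
             avail = $4 + 0
             if (n == 0 || avail < min) min = avail
             if (n == 0 || avail > max) max = avail
             n++
         }
         END {
             if (n == 0 || max == 0) { print "no OST lines found"; exit 1 }
             printf "%d\n", (max - min) * 100 / max
         }'
}

# Only touch the filesystem when lfs and the test directory are present.
if command -v lfs >/dev/null 2>&1 && [ -d "$TESTDIR" ]; then
    # Default layout of the directory: a stripe offset of -1 lets the MDS
    # pick the starting OST; a fixed offset pins new files to that OST.
    lfs getstripe -d "$TESTDIR"

    # OST indices actually assigned to one of the job's files, if given.
    if [ -n "${1:-}" ]; then
        lfs getstripe "$1"
    fi

    # Compare this number with the 17% figure mentioned above.
    lfs df "$TESTDIR" | ost_imbalance_pct
fi
```

If the directory turns out to have a fixed stripe offset, `lfs setstripe -c 36 -i -1 <dir>` should hand the choice of starting OST back to the MDS for newly created files.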

Cheers, Andreas
Andreas Dilger
Principal Lustre Architect
