[lustre-discuss] parallel write/reads problem

Riccardo Veraldi Riccardo.Veraldi at cnaf.infn.it
Thu May 10 21:55:28 PDT 2018


Hello,
So far I have not been able to solve this problem on my Lustre setup.
I can reach very good performance with multi-threaded writes or reads,
i.e. sequential writes and sequential reads run at different times.
I can saturate Infiniband FDR capabilities, reaching 6GB/s.
The problem arises when, while writing, I also start reading the same
file or even a different file.
In our I/O model there are writers and readers; the readers start
reading files shortly after the writers begin writing them. In this
case read performance drops dramatically: writes still go up to 6GB/s,
but reads hit a barrier and won't go above 3GB/s.
I tried all kinds of optimizations. ZFS performs very well on its own,
but when Lustre is on top of it I see this problem.
Infiniband is working at full speed, and an LNet selftest also runs at
full speed. So I do not understand why read performance goes down as
soon as writes and reads are concurrent.

I also tweaked the ko2iblnd parameters to gain more parallelism:

options ko2iblnd timeout=100 peer_credits=63 credits=2560
concurrent_sends=63 ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024
ntx=5120
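For reference, module parameters like these are usually persisted in a
modprobe configuration file so they survive reboots; a sketch (the file
name is just a convention, and the values are copied verbatim from
above, including the repeated ntx setting):

```shell
# /etc/modprobe.d/ko2iblnd.conf -- read when the ko2iblnd module loads;
# changing these requires unloading LNet/ko2iblnd (or rebooting) to apply.
options ko2iblnd timeout=100 peer_credits=63 credits=2560 \
    concurrent_sends=63 ntx=2048 fmr_pool_size=1280 \
    fmr_flush_trigger=1024 ntx=5120
```

Note that ntx appears twice with different values; it may be worth
consolidating to a single value so it is unambiguous which one is in
effect.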

then on OSS side:

lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600

on client side:

lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*.max_rpcs_in_flight=64
lctl set_param osc.*.max_dirty_mb=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_cached_mb=81920
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
lctl set_param subsystem_debug=0
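Since lctl set_param changes are not persistent across remounts, it can
be worth double-checking that the values actually took effect on the
clients, e.g.:

```shell
# Read back the key client-side tunables set above (paths as on a
# typical Lustre client; adjust patterns to your filesystem).
lctl get_param osc.*.max_rpcs_in_flight osc.*.max_dirty_mb
lctl get_param llite.*.max_read_ahead_mb llite.*.max_cached_mb
```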

I tried to set

lctl set_param osc.*.max_pages_per_rpc=1024

but it is not allowed:

 lctl set_param osc.*.max_pages_per_rpc=1024
error: set_param: setting
/proc/fs/lustre/osc/drplu-OST0001-osc-ffff881ed6b05800/max_pages_per_rpc=1024:
Numerical result out of range
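That "out of range" error is expected on setups where the OSTs still
advertise the default 1 MiB bulk RPC size: with 4 KiB pages the client
cap is then 256 pages, so 1024 is rejected. According to the Lustre
manual, larger RPCs must first be enabled on the OSS side via brw_size
(a sketch, assuming the filesystem name "drplu" from the error path and
a Lustre version that supports 4 MiB bulk RPCs; verify against your
release before applying):

```shell
# On the OSS: raise the per-OST bulk RPC size to 4 MiB (value in MiB).
lctl set_param obdfilter.drplu-OST*.brw_size=4

# On the MGS, to make it persistent across restarts:
lctl set_param -P obdfilter.drplu-OST*.brw_size=4

# After clients reconnect to the OSTs, the larger value becomes settable:
lctl set_param osc.*.max_pages_per_rpc=1024
```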


Any other ideas on what I could work on to get better performance with
concurrent writes/reads?

thank you


Rick




