[lustre-discuss] parallel write/reads problem

Patrick Farrell paf at cray.com
Fri May 11 06:03:19 PDT 2018


Rick,

Your description gets (to my eyes) a little vague in the middle there.

Can you be more specific about the concurrent reading and writing?  This is to a single file, at the same time?  I’m going to assume so and write accordingly.

In general, that will be slow in Lustre, depending on the details of the pattern and striping. Even if your I/O doesn’t actually overlap, Lustre locks larger areas of the file (most commonly, whole stripes) during I/O, then it tries to hold on to the lock so it can use it again.

Normally this behavior is great - the lock is already there when you need it, and that helps performance a ton in various ways.  But in certain cases where writes (or writes and reads) are being done to one file at the same time, you end up doing a “lock exchange”, where much of the I/O has to cancel the lock held by another node in order to get its own lock, and this repeats over and over.
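
One way to see whether this is happening is to watch the client-side DLM lock counts and cancel rates while the mixed read/write phase runs; roughly (exact parameter names can vary a little between Lustre versions):

# number of DLM locks currently cached per OSC namespace on a client
lctl get_param ldlm.namespaces.*osc*.lock_count
# cancel rate from the lock pools - a high value during the mixed phase points at lock exchange
lctl get_param ldlm.namespaces.*osc*.pool.cancel_rate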

There’s no simple, general solution to this problem.  The lockahead feature can help, but it requires either that your application already use MPI-IO or a fair amount of coding on your part.  The best bet for you today is probably to increase the stripe count as high as your setup allows - since locking is per stripe, that gives you more locks to play with, which should help some.  (I’d be interested to know how much.)
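
For example, a wide default stripe can be set on the directory where these files are created (the path below is just a placeholder; -c -1 means “stripe across all OSTs”):

# new files created under /lustre/mydir are striped over all OSTs with a 1 MiB stripe size
lfs setstripe -c -1 -S 1M /lustre/mydir
# check the layout a new file actually got
lfs getstripe /lustre/mydir/testfile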

Best of luck.

Regards,
- Patrick


________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Riccardo Veraldi <Riccardo.Veraldi at cnaf.infn.it>
Sent: Thursday, May 10, 2018 11:55:28 PM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] parallel write/reads problem

Hello,
So far I have not been able to solve this problem on my Lustre setup.
I can reach very good performance with multi-threaded writes or reads,
that is, sequential writes and sequential reads run at different times.
I can saturate InfiniBand FDR, reaching 6 GB/s.
The problem arises when, while writing, I also start reading the same
file, or even a different file.
In our I/O model there are writers, and readers that start reading the
files a while after they have begun being written. In that case read
performance drops dramatically: writes still reach 6 GB/s, but reads hit
a ceiling and will not go above 3 GB/s.
I have tried all kinds of optimizations. ZFS performs very well on its
own, but with Lustre on top of it I have this problem.
InfiniBand runs at full speed, and an LNet self-test also runs at full
speed. So I do not understand why read performance drops as soon as
there are concurrent writes and reads.
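
For reference, a standard lnet_selftest brw run to measure the LNet-level bandwidth looks roughly like this (the NIDs below are placeholders):

modprobe lnet_selftest
export LST_SESSION=$$
lst new_session rw_test
lst add_group servers 10.0.0.1@o2ib    # OSS NID (placeholder)
lst add_group clients 10.0.0.2@o2ib    # client NID (placeholder)
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from clients --to servers brw write size=1M
lst run bulk_rw
lst stat clients                       # prints bandwidth periodically; Ctrl-C to stop
lst end_session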

I also tweaked the ko2iblnd parameters to gain more parallelism:

options ko2iblnd timeout=100 peer_credits=63 credits=2560
concurrent_sends=63 ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024
ntx=5120
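
The values the module actually ended up with can be double-checked after loading, e.g.:

# confirm the parameters ko2iblnd is running with
cat /sys/module/ko2iblnd/parameters/peer_credits
cat /sys/module/ko2iblnd/parameters/ntx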

Then, on the OSS side:

lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600
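
(These are runtime settings; the same values can also be made persistent from the MGS, assuming a Lustre version where "lctl set_param -P" is available:)

lctl set_param -P timeout=600
lctl set_param -P ldlm_timeout=200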

On the client side:

lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*.max_rpcs_in_flight=64
lctl set_param osc.*.max_dirty_mb=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_cached_mb=81920
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
lctl set_param subsystem_debug=0
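
To check whether the larger RPC and readahead settings actually take effect during the concurrent phase, the per-OSC RPC histograms can be inspected, e.g.:

# writing to rpc_stats clears the counters; run the workload, then read them back
lctl set_param osc.*.rpc_stats=0
lctl get_param osc.*.rpc_stats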

I tried to set

lctl set_param osc.*.max_pages_per_rpc=1024

but it is not allowed...

 lctl set_param osc.*.max_pages_per_rpc=1024
error: set_param: setting
/proc/fs/lustre/osc/drplu-OST0001-osc-ffff881ed6b05800/max_pages_per_rpc=1024:
Numerical result out of range
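
If I read the documentation correctly, the client-side max_pages_per_rpc is capped by the OST brw_size (1 MB, i.e. 256 pages, by default), which would explain the out-of-range error. Something like the following on the OSS side might be needed first (assuming Lustre 2.9 or newer; I have not verified this here):

# on the OSS: allow 4 MB bulk RPCs per OST
lctl set_param obdfilter.*.brw_size=4
# then, after the clients reconnect, on a client:
lctl set_param osc.*.max_pages_per_rpc=1024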


Any other ideas on what I could work on to get better performance with
concurrent writes/reads?

thank you


Rick


