[lustre-discuss] how to optimize write performances
Degremont, Aurelien
degremoa at amazon.com
Tue Oct 5 02:56:25 PDT 2021
Hello
Direct I/O affects the whole I/O path, from the client down to ZFS. Agreed, ZFS does not support it, but the rest of the I/O path does.
Could you provide your fio command line?
As I said, you need to do _large I/O_ of multiple MB in size. If you are just doing 1 MB I/O (assuming the stripe size is 1 MB), your application will send only 1 RPC at a time to 1 OST, wait for the reply, and send the next one. The client cache will help at the beginning, until it is full (32 MB max_dirty_mb per OST by default).
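For example, a minimal fio sketch for large sequential buffered writes (the file path, block size, and file size are placeholders, not recommendations):

    # Hypothetical: single writer, 16 MB sequential writes to a file on Lustre
    fio --name=seqwrite --filename=/lustre/testfile \
        --rw=write --bs=16m --size=100g --ioengine=psync --direct=0

You can also check, and if needed raise, the per-OST client write cache (the value below is only an example):

    # Inspect and bump the per-OSC dirty cache limit
    lctl get_param osc.*.max_dirty_mb
    sudo lctl set_param osc.*.max_dirty_mb=256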
What about rpc_stats?
Aurélien
On 04/10/2021 18:32, "Riccardo Veraldi" <riccardo.veraldi at cnaf.infn.it> wrote:
Hello Aurelien,
I am using ZFS as the Lustre backend. ZFS does not support direct I/O.
Lustre does, but performance with direct I/O was worse with the ZFS
backend anyway, at least in my tests.
Best
Riccardo
On 10/1/21 2:22 AM, Degremont, Aurelien wrote:
> Hello
>
> To achieve higher throughput with a single-threaded process, you should try to limit latencies and parallelize under the hood.
> Try checking the following parameters:
> - Stripe your file across multiple OSTs
> - Do large I/O, multiple MB per write, to let Lustre send multiple RPCs to different OSTs
> - Try testing with and without Direct I/O (see the sketch after this list).
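>
> For example, a minimal sketch (the /lustre mount point and the counts are placeholders):
>
>     # Hypothetical: stripe a new file across all OSTs with a 1 MB stripe size
>     lfs setstripe -c -1 -S 1M /lustre/testfile
>
>     # Large sequential writes, buffered and with Direct I/O
>     dd if=/dev/zero of=/lustre/testfile bs=16M count=640
>     dd if=/dev/zero of=/lustre/testfile bs=16M count=640 oflag=direct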
>
> What is your 'dd' test command?
> Clear and check the RPC stats (sudo lctl set_param osc.*.rpc_stats=clear; sudo lctl get_param osc.*.rpc_stats). Check that you are sending large RPCs (pages per RPC).
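>
> The RPC size and concurrency are also tunable per OSC; a sketch (these values are examples, not recommendations):
>
>     # 1024 pages x 4 KB = 4 MB RPCs; allow more concurrent RPCs per OST
>     sudo lctl set_param osc.*.max_pages_per_rpc=1024
>     sudo lctl set_param osc.*.max_rpcs_in_flight=16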
>
> Aurélien
>
> On 30/09/2021 18:11, "lustre-discuss on behalf of Riccardo Veraldi" <lustre-discuss-bounces at lists.lustre.org on behalf of riccardo.veraldi at cnaf.infn.it> wrote:
>
> Hello,
>
> I wanted to ask for some hints on how I can increase single-process
> sequential write performance on Lustre.
>
> I am using Lustre 2.12.7 on RHEL 7.9
>
> I have a number of OSSes with SAS SSDs in raidz: 3 OSTs per OSS, and each
> OST is made of 8 SSDs in raidz.
>
> In a local test with multiple writes I can write to and read from the
> zpool at 7 GB/s per OSS.
>
> With the Lustre/ZFS backend I can reach peak writes of 5.5 GB/s per OSS,
> which is OK.
>
> This happens, however, only with several concurrent writes to the
> filesystem.
>
> A single write cannot go faster than 800 MB/s to 1 GB/s.
>
> Changing the underlying hardware and moving to NVMe improves
> single-write performance, but only slightly.
>
> What is preventing a single-write pattern from performing better? They are
> XTC files.
>
> Each single SSD has a 500 MB/s write capability per factory specs, so it
> seems that with a single write it is not possible to take advantage of the
> zpool parallelism. I also tried striping, but that does not really help much.
>
> Any hint is really appreciated.
>
> Best
>
> Riccardo
>
>
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>