[lustre-discuss] Performance of Single Shared File.

Cong Le Duy congld at innotech-vn.com
Wed Jul 31 19:52:09 PDT 2024


Hi all,

I am testing a Lustre system that includes 1 MGS, 2 MDS, and 8 OSS with 8 OSTs running RAID 6 (8d+2p). Each OST delivers approximately 16 GB/s for WRITE and 33 GB/s for READ (measured with a sequential FIO test: blocksize=1m, iodepth=64, numjobs=2). The system has 16 clients.
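For reference, a representative per-OST FIO invocation would look like the sketch below; the target device path, the libaio engine, and direct I/O are assumptions, since only the block size, iodepth, and numjobs are listed above.

```
# Sequential write baseline against one OST backing device (illustrative path)
fio --name=ost-seq-write --filename=/dev/mapper/ost0 --rw=write \
    --bs=1m --iodepth=64 --numjobs=2 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting
```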

I am encountering issues with performance testing using IOR with the following options:
```
mpirun --allow-run-as-root --mca pml ucx -x UCX_TLS=rc_mlx5,ud_mlx5,self -x UCX_NET_DEVICES=mlx5_0:1 --mca btl ^openib --hostfile mphost10 -np <number_of_process> -map-by node ior -w -r -b 2m -t 2m -C -s 4000 -k -e -o /lustre/testFS/ior/iortest
```

The stripe_count is set equal to the number of processes (overstriping), and the stripe_size is equal to the block size (2m). The issues I am facing are:

  1.  Performance does not increase beyond 2 processes per client. With 1 client and 1 OST, I achieve approximately 2 GB/s for WRITE. With 2 clients and 4 processes, I achieve 4 GB/s. To reach 16 GB/s, I need to use 16 clients with 2 processes per client.

| Stripe count | NP | Write (MB/s) | Read (MB/s) |
|-------------:|---:|-------------:|------------:|
| 1   | 1  | 1843.57 | 1618.57 |
| 1   | 2  | 2079.28 | 1914.32 |
| 2   | 2  | 2579.28 | 2298.19 |
| 2   | 4  | 1337.38 | 1310.23 |
| 16  | 16 | 1313.24 | 1345.24 |
| 16  | 32 | 1455.45 | 1398.23 |
| 32  | 32 | 1477.75 | 1410.68 |
| 800 | 32 | 1326.41 | 1210.13 |


  2.  Performance does not improve by adding more OSTs. With 2 OSTs and 2 clients, the performance remains at 4 GB/s, and with 16 clients, the performance is only equivalent to that of a single OST.

I am wondering why the performance does not scale after 2 processes per client. Could it be that overstriping alone is not sufficient to enhance performance for Single Shared File mode? Are there any additional settings I should consider configuring beyond overstriping?
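For context, the overstriped layout on the test directory is created roughly as follows; this is a sketch with an example count of 32 (in practice the count is set equal to the number of processes), and -C is the overstripe-count option available since Lustre 2.13.

```
# Overstripe the IOR directory: 32 stripes (can place multiple stripes per OST), 2 MiB stripe size
lfs setstripe -C 32 -S 2m /lustre/testFS/ior
# Verify the default layout that new files in the directory will inherit
lfs getstripe -d /lustre/testFS/ior
```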

The results of obdfilter-survey and LNet self-test do not show any bottleneck.
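For completeness, the backend and network checks follow the usual lustre-iokit and LNet self-test patterns; the sketch below uses placeholder thread counts and NIDs rather than the exact values from this run.

```
# OST backend throughput, run on an OSS (obdfilter-survey from lustre-iokit)
nobjhi=2 thrhi=64 size=1024 case=disk obdfilter-survey

# LNet self-test between a client NID and a server NID (placeholder NIDs)
modprobe lnet_selftest
export LST_SESSION=$$
lst new_session read_write
lst add_group clients 10.0.0.10@o2ib
lst add_group servers 10.0.0.20@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from clients --to servers brw write size=1M
lst run bulk_rw
lst stat clients servers
lst end_session
```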

I am using Lustre 2.15.4 with Rocky Linux 8.9 and kernel 4.18.0-513.9.1.el8_lustre.x86_64.
- MGS/MDS/OSS servers: 16 CPUs, 32 GB RAM.
- Clients: 2x AMD EPYC 7662 (64 cores each), 512 GB RAM.
The network connection is InfiniBand with 400 Gbps bandwidth.

Other settings on the Lustre cluster:

```

# Clients:
options lnet networks="o2ib(ib0)"
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64
lctl set_param osc.*.max_pages_per_rpc=4096    # 4096 x 4 KiB pages = 16 MiB RPCs
lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_rpcs_in_flight=16

# OSSs:
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
options ost oss_num_threads=128
lctl set_param *.*.brw_size=16                 # 16 MiB bulk RPCs, matching the client max_pages_per_rpc
lctl set_param osd-ldiskfs.*.writethrough_cache_enable=0
lctl set_param osd-ldiskfs.*.read_cache_enable=0

# MGS/MDSs:
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
```
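As a sanity check on the client-side tunables above, the configured RPC size and the actual RPC size distribution can be read back on a client; the exact parameter paths include the filesystem name, so the wildcards below are illustrative.

```
# Confirm 16 MiB RPCs are configured (4096 x 4 KiB pages) and how many are in flight
lctl get_param osc.*.max_pages_per_rpc
lctl get_param osc.*.max_rpcs_in_flight
# The "pages per rpc" histogram in rpc_stats shows whether I/O really leaves in full-size RPCs
lctl get_param osc.*.rpc_stats
```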
Thank you for your help.
