[lustre-discuss] Performance of Single Shared File.
Cong Le Duy
congld at innotech-vn.com
Wed Jul 31 19:52:09 PDT 2024
Hi all,
I am testing a Lustre system that includes 1 MGS, 2 MDSs, and 8 OSSs with 8 OSTs running RAID 6 (8d+2p). Each OST delivers approximately 16 GB/s WRITE and 33 GB/s READ (measured with an fio test: blocksize=1m, iodepth=64, numjobs=2, sequential). The system has 16 clients.
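Roughly, the backend fio test was of this form (a sketch only; the target device path and job name are placeholders, not the exact ones used):
```
# Sequential write test against one OST backend device (placeholder path)
fio --name=ost-seq-write --filename=/dev/mapper/ost0 --rw=write \
    --bs=1m --iodepth=64 --numjobs=2 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting
```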
I am encountering issues with performance testing using IOR with the following options:
```
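# IOR options used: -w/-r run write and read phases, -b 2m block size per task,
# -t 2m transfer size, -C shift ranks between write and read to avoid
# client-side cache effects, -s 4000 segments per task, -k keep the test file,
# -e fsync after write close, -o path of the shared test file.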
mpirun --allow-run-as-root --mca pml ucx -x UCX_TLS=rc_mlx5,ud_mlx5,self -x UCX_NET_DEVICES=mlx5_0:1 --mca btl ^openib --hostfile mphost10 -np <number_of_process> -map-by node ior -w -r -b 2m -t 2m -C -s 4000 -k -e -o /lustre/testFS/ior/iortest
```
The stripe_count is set equal to the number of processes (overstriping), and the stripe_size is equal to the block size (2m). The issues I am facing are:
1. Performance does not increase beyond 2 processes per client. With 1 client and 1 OST, I achieve approximately 2 GB/s for WRITE. With 2 clients and 4 processes, I achieve 4 GB/s. To reach 16 GB/s, I need to use 16 clients with 2 processes per client.
| Stripe count | NP | Write (MB/s) | Read (MB/s) |
|---|---|---|---|
| 1 | 1 | 1843.57 | 1618.57 |
| 1 | 2 | 2079.28 | 1914.32 |
| 2 | 2 | 2579.28 | 2298.19 |
| 2 | 4 | 1337.38 | 1310.23 |
| 16 | 16 | 1313.24 | 1345.24 |
| 16 | 32 | 1455.45 | 1398.23 |
| 32 | 32 | 1477.75 | 1410.68 |
| 800 | 32 | 1326.41 | 1210.13 |
2. Performance does not improve when adding more OSTs. With 2 OSTs and 2 clients, the performance remains at 4 GB/s, and with 16 clients, the performance is only equivalent to that of 1 OST.
I am wondering why the performance does not scale beyond 2 processes per client. Could it be that overstriping alone is not sufficient to improve performance in single-shared-file mode? Are there any additional settings I should consider configuring beyond overstriping?
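For context, the overstriped layout was created roughly as follows (a sketch; the overstripe count shown is just an example of matching it to the process count):
```
# Pre-create the shared file with overstriping: -C sets the (over)stripe
# count, which may exceed the number of OSTs; -S sets the stripe size.
lfs setstripe -C 32 -S 2M /lustre/testFS/ior/iortest
```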
The results of obdfilter-survey and LNet testing do not show any bottleneck.
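Those checks were run roughly along these lines (a sketch; thread counts, sizes, and NIDs are placeholders):
```
# OST backend survey on an OSS (lustre-iokit); parameters are illustrative
thrlo=4 thrhi=32 nobjlo=1 nobjhi=8 size=16384 case=disk obdfilter-survey

# LNet bandwidth between one client and one OSS (lnet_selftest); NIDs are placeholders
export LST_SESSION=$$
lst new_session bw_check
lst add_group clients 10.0.0.101@o2ib
lst add_group servers 10.0.0.1@o2ib
lst add_batch bulk
lst add_test --batch bulk --from clients --to servers brw write size=1M
lst run bulk
lst stat clients servers
lst end_session
```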
I am using Lustre 2.15.4 with Rocky Linux 8.9 and kernel 4.18.0-513.9.1.el8_lustre.x86_64.
- MGS/MDS/OSS hardware: 16 CPUs, 32 GB RAM.
- Client hardware: 2x AMD EPYC 7662 (64 cores each), 512 GB RAM.
The network connection is InfiniBand with 400 Gbps bandwidth.
Other settings on the Lustre cluster:
```
# Clients:
options lnet networks="o2ib(ib0)"
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64
lctl set_param osc.*.max_pages_per_rpc=4096
lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_rpcs_in_flight=16
# OSSs:
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
options ost oss_num_threads=128
lctl set_param *.*.brw_size=16
lctl set_param osd-ldiskfs.*.writethrough_cache_enable=0
lctl set_param osd-ldiskfs.*.read_cache_enable=0
# MGS – MDSs
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
```
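For reference, the effective values can be verified with lctl get_param, e.g.:
```
# Verify client-side RPC tunables after applying the settings above
lctl get_param osc.*.max_pages_per_rpc osc.*.max_rpcs_in_flight osc.*.checksums
# Verify OSS-side bulk RPC size (MB)
lctl get_param obdfilter.*.brw_size
```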
Thank you for your help.