[lustre-discuss] Lustre I/O abnormally slow
Tung-Han Hsieh
thhsieh at twcp1.phys.ntu.edu.tw
Mon Sep 21 21:14:25 PDT 2020
Dear All,
We have a cluster running Lustre-2.12.4, and we occasionally encounter
serious I/O slowdowns. I am asking how to fix this problem.
There is one MDT server and one OST server. Since our operating system
is Debian-9.12, we installed Lustre by compiling it from source:
- Operating system: Debian-9.12
- Linux kernel: 4.19.123
- Infiniband software: MLNX_OFED_SRC-debian-4.6-1.0.1.1
- Infiniband hardware: FDR
- MDT: spl-0.7.13 + zfs-0.7.13 + (Infiniband software) + Lustre-2.12.4
- OST: spl-0.7.13 + zfs-0.7.13 + (Infiniband software) + Lustre-2.12.4
- Client: (Infiniband software) + Lustre-2.12.4
- Some clients connect to the MDT/OST through gigabit Ethernet (because
they don't have an Infiniband card), and the others connect through
Infiniband. The Infiniband clients use only Infiniband, since on these
clients we set in /etc/modprobe.d/lustre.conf:
options lnet networks="o2ib0(ib0)"
In the following we only discuss the case of clients connecting through
Infiniband.
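On the Infiniband clients, the configured NIDs and basic LNet reachability can be checked with the standard lctl utility. A sketch (the server NID is the one used in the selftest below; substitute your own):

```shell
# On an Infiniband client: confirm that only the o2ib NID is configured
lctl list_nids
# LNet-level reachability check from the client to the server
lctl ping 192.168.12.141@o2ib
```

If lctl ping is slow or fails while ib_write_bw is fine, the problem is in the LNet layer rather than the Infiniband fabric.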
With this configuration, we occasionally see abnormally slow I/O from
one of the clients to the Lustre file system. When that happens, all
the other clients are normal, and there is almost no load on the whole
cluster.
We have done some tests, as shown below.
1. The timings of the normal and abnormal cases are as follows (data file
size: 577 MB):
# time cat /lustre/filesystem/datafile > /dev/null
normally: 0.265s
abnormally: 0.560s
# time cp /lustre/filesystem/datafile /lustre/another_dir/
normally: 1.0s
abnormally: 60s or longer
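For repeatable measurements, the ad-hoc commands above can be wrapped in a small script. A sketch: SRC and DEST default to throwaway paths under /tmp so it runs anywhere; point them at the Lustre mount (e.g. /lustre/filesystem/datafile) for the real test.

```shell
#!/bin/bash
# Read/copy timing harness (sketch). SRC/DEST are placeholders;
# override them to test the Lustre mount.
SRC=${SRC:-/tmp/lustre_timing_src}
DEST=${DEST:-/tmp/lustre_timing_dst}

# Create a dummy data file if SRC does not exist (64 MB here;
# the original measurement used a 577 MB file).
[ -f "$SRC" ] || dd if=/dev/zero of="$SRC" bs=1M count=64 2>/dev/null

# Note: repeated reads may be served from the client page cache,
# so the first run after a reboot (or cache drop) is the honest one.
time cat "$SRC" > /dev/null   # pure read path
time cp "$SRC" "$DEST"        # read + write path

cmp -s "$SRC" "$DEST" && echo "copy OK"
```
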
2. We checked dmesg on the MDT, the OST, and the clients. There are no messages at all.
3. We checked the Infiniband I/O performance with "ib_write_bw". In both
the normal and the abnormal situation, the Infiniband I/O performance from
the client to the OST is almost the same:
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
CQ Moderation : 100
Mtu : 2048[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x09 QPN 0x025a PSN 0x8ad56b RKey 0x10010200 VAddr 0x007f179e3f7000
remote address: LID 0x08 QPN 0x021b PSN 0x8f4018 RKey 0x8010200 VAddr 0x00149df8dc3000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
65536 5000 5975.01 5974.74 0.095596
---------------------------------------------------------------------------------------
So it seems that this is not a problem with the Infiniband connection,
but probably a problem with the Lustre file system.
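When the slowdown occurs, the Lustre client-side statistics on the affected node may narrow this down further. A sketch using standard lctl parameters (run on the slow client):

```shell
# Import state of each OST connection; healthy imports show
# "state: FULL", while DISCONN/RECOVER indicates reconnection trouble.
lctl get_param osc.*.import
# Per-OST RPC histograms; a struggling client often shows
# unusually small or heavily queued RPCs.
lctl get_param osc.*.rpc_stats
```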
4. We tested the LNet performance in the abnormal case following:
https://wiki.lustre.org/LNET_Selftest#Appendix:_LNET_Selftest_Wrapper
Our parameters are:
========================================================================
#Output file
ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S)
# Concurrency
CN=64
#Size
SZ=1M
# Length of time to run test (secs)
TM=30
# Which BRW test to run (read or write)
BRW=read
# Checksum calculation (simple or full)
CKSUM=simple
# The LST "from" list -- e.g. Lustre clients. Space separated list of NIDs.
LFROM="192.168.12.1@o2ib"
# The LST "to" list -- e.g. Lustre servers. Space separated list of NIDs.
LTO="192.168.12.141@o2ib"
========================================================================
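For reference, with these parameters the wrapper boils down to roughly the following lst commands (a sketch of the manual equivalent, not the exact wiki script):

```shell
export LST_SESSION=$$
lst new_session read_test
lst add_group lfrom 192.168.12.1@o2ib
lst add_group lto 192.168.12.141@o2ib
lst add_batch bulk_read
lst add_test --batch bulk_read --concurrency 64 \
    --from lfrom --to lto brw read check=simple size=1M
lst run bulk_read
lst stat lfrom lto &        # prints the [LNet Rates/Bandwidth] blocks
STAT_PID=$!
sleep 30
kill $STAT_PID
lst stop bulk_read
lst end_session
```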
The results of the last 5 outputs are:
[LNet Rates of lto]
[R] Avg: 5497 RPC/s Min: 5497 RPC/s Max: 5497 RPC/s
[W] Avg: 10994 RPC/s Min: 10994 RPC/s Max: 10994 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5497.74 MiB/s Min: 5497.74 MiB/s Max: 5497.74 MiB/s
[LNet Rates of lfrom]
[R] Avg: 11018 RPC/s Min: 11018 RPC/s Max: 11018 RPC/s
[W] Avg: 5509 RPC/s Min: 5509 RPC/s Max: 5509 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5508.93 MiB/s Min: 5508.93 MiB/s Max: 5508.93 MiB/s
[W] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[LNet Rates of lto]
[R] Avg: 5508 RPC/s Min: 5508 RPC/s Max: 5508 RPC/s
[W] Avg: 11015 RPC/s Min: 11015 RPC/s Max: 11015 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5507.83 MiB/s Min: 5507.83 MiB/s Max: 5507.83 MiB/s
[LNet Rates of lfrom]
[R] Avg: 10974 RPC/s Min: 10974 RPC/s Max: 10974 RPC/s
[W] Avg: 5487 RPC/s Min: 5487 RPC/s Max: 5487 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5487.16 MiB/s Min: 5487.16 MiB/s Max: 5487.16 MiB/s
[W] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[LNet Rates of lto]
[R] Avg: 5488 RPC/s Min: 5488 RPC/s Max: 5488 RPC/s
[W] Avg: 10974 RPC/s Min: 10974 RPC/s Max: 10974 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5487.36 MiB/s Min: 5487.36 MiB/s Max: 5487.36 MiB/s
5. Finally, we found that if we unplug the Infiniband cable on the client
side and plug it back in, the I/O performance recovers to normal.
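Since re-seating the cable helps, the fault may be in the client's LNet/IB state rather than the physical link. A possibly less disruptive workaround (an untested guess; the mount point, MGS NID, and filesystem name below are placeholders) is to bounce the Lustre modules on the affected client while the filesystem is idle:

```shell
# On the affected client, with no jobs using the filesystem:
umount /lustre/filesystem
lustre_rmmod                  # unload the Lustre and LNet modules
modprobe lustre               # reload; /etc/modprobe.d/lustre.conf is re-read
mount -t lustre <mgs-nid>@o2ib:/<fsname> /lustre/filesystem
```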
What else could we do to fix this problem? Any suggestions are
greatly appreciated.
Thank you very much.
T.H.Hsieh