[lustre-discuss] Lustre I/O abnormally slow

Tung-Han Hsieh thhsieh at twcp1.phys.ntu.edu.tw
Mon Sep 21 21:14:25 PDT 2020


Dear All,

We have a cluster running Lustre-2.12.4. We occasionally encounter a
serious I/O slowdown, and I am asking how to fix this problem.

There is one MDT server and one OST server. Since our operating system
is Debian-9.12, we installed Lustre by compiling it from source:

- Operating system: Debian-9.12
- Linux kernel: 4.19.123
- Infiniband software: MLNX_OFED_SRC-debian-4.6-1.0.1.1
  Infiniband hardware: FDR
- MDT: spl-0.7.13 + zfs-0.7.13 + (Infiniband software) + Lustre-2.12.4
- OST: spl-0.7.13 + zfs-0.7.13 + (Infiniband software) + Lustre-2.12.4
- Client: (Infiniband software) + Lustre-2.12.4
- Some clients connect to the MDT/OST through gigabit Ethernet (because
  they do not have an InfiniBand card), and the others connect through
  InfiniBand. The InfiniBand clients go through InfiniBand only, since
  on these clients we set the following in /etc/modprobe.d/lustre.conf:
	options lnet networks="o2ib0(ib0)"

In the following we only discuss the clients that connect through
InfiniBand.
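
For completeness, the LNet configuration of such a client can be verified
with commands along the following lines (a sketch; the server NID shown is
the one used in the LNet selftest further below, adjust as needed):

	# show the NIDs configured on this client; only ...@o2ib should appear
	lctl list_nids

	# ping a server NID over LNet from the client
	lctl ping 192.168.12.141@o2ib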

With this configuration, we occasionally see abnormally slow I/O from
one of the clients to the Lustre file system. When that happens, the
other clients are all normal, and there is almost no load on the whole
cluster.

We have done some tests, as shown below.

1. The timings of the normal and abnormal cases are as follows (datafile
   size: 577 MB):

   # time cat /lustre/filesystem/datafile > /dev/null
     normally:   0.265s
     abnormally: 0.560s

   # time cp /lustre/filesystem/datafile /lustre/another_dir/ 
     normally:   1.0s
     abnormally: 60s or longer
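
   Before each timing run, the client page cache can be dropped and the
   file's stripe layout checked; the following is only a sketch of how,
   not part of the measurements above:

	# drop the client page cache so that 'cat' measures Lustre reads
	# rather than cached data (run as root on the client)
	sync
	echo 3 > /proc/sys/vm/drop_caches

	# show which OST objects the file is striped over
	lfs getstripe /lustre/filesystem/datafile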

2. We checked dmesg on the MDT, the OST, and the clients. There are no
   messages at all.
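
   Besides dmesg, Lustre keeps its own kernel debug buffer; it could be
   dumped for inspection with, for example:

	# dump the Lustre kernel debug log on the affected client (and on
	# the OST) to a file
	lctl dk /tmp/lustre-debug.log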

3. We checked the InfiniBand I/O performance with "ib_write_bw". In both
   the normal and the abnormal situations, the InfiniBand I/O performance
   from the client to the OST is almost the same:

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x09 QPN 0x025a PSN 0x8ad56b RKey 0x10010200 VAddr 0x007f179e3f7000
 remote address: LID 0x08 QPN 0x021b PSN 0x8f4018 RKey 0x8010200 VAddr 0x00149df8dc3000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
 65536      5000             5975.01            5974.74            0.095596
---------------------------------------------------------------------------------------

   So it seems that this is not a problem with the InfiniBand connection,
   but probably a problem with the Lustre file system.
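
   Besides the bandwidth test, the IB port state and error counters on the
   affected client may also be worth checking (a sketch using the standard
   infiniband-diags tools; these are not measurements we have collected):

	# port state, physical state and active rate/width of the local HCA
	ibstat mlx4_0

	# per-port error counters (symbol errors, link recovery, receive
	# errors); counters that keep growing would point at a marginal
	# cable or port
	perfquery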

4. We tested the LNet performance in the abnormal case by following:

   https://wiki.lustre.org/LNET_Selftest#Appendix:_LNET_Selftest_Wrapper

   Our parameters are (the underlying lst commands are sketched after the
   results below):

   ========================================================================
   #Output file
   ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S)
   # Concurrency
   CN=64
   #Size
   SZ=1M
   # Length of time to run test (secs)
   TM=30
   # Which BRW test to run (read or write)
   BRW=read
   # Checksum calculation (simple or full)
   CKSUM=simple
   # The LST "from" list -- e.g. Lustre clients. Space separated list of NIDs.
   LFROM="192.168.12.1 at o2ib"
   # The LST "to" list -- e.g. Lustre servers. Space separated list of NIDs.
   LTO="192.168.12.141 at o2ib"
   ========================================================================

   The last 5 output blocks are:

[LNet Rates of lto]
[R] Avg: 5497     RPC/s Min: 5497     RPC/s Max: 5497     RPC/s
[W] Avg: 10994    RPC/s Min: 10994    RPC/s Max: 10994    RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84     MiB/s Min: 0.84     MiB/s Max: 0.84     MiB/s
[W] Avg: 5497.74  MiB/s Min: 5497.74  MiB/s Max: 5497.74  MiB/s
[LNet Rates of lfrom]
[R] Avg: 11018    RPC/s Min: 11018    RPC/s Max: 11018    RPC/s
[W] Avg: 5509     RPC/s Min: 5509     RPC/s Max: 5509     RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5508.93  MiB/s Min: 5508.93  MiB/s Max: 5508.93  MiB/s
[W] Avg: 0.84     MiB/s Min: 0.84     MiB/s Max: 0.84     MiB/s
[LNet Rates of lto]
[R] Avg: 5508     RPC/s Min: 5508     RPC/s Max: 5508     RPC/s
[W] Avg: 11015    RPC/s Min: 11015    RPC/s Max: 11015    RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84     MiB/s Min: 0.84     MiB/s Max: 0.84     MiB/s
[W] Avg: 5507.83  MiB/s Min: 5507.83  MiB/s Max: 5507.83  MiB/s
[LNet Rates of lfrom]
[R] Avg: 10974    RPC/s Min: 10974    RPC/s Max: 10974    RPC/s
[W] Avg: 5487     RPC/s Min: 5487     RPC/s Max: 5487     RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5487.16  MiB/s Min: 5487.16  MiB/s Max: 5487.16  MiB/s
[W] Avg: 0.84     MiB/s Min: 0.84     MiB/s Max: 0.84     MiB/s
[LNet Rates of lto]
[R] Avg: 5488     RPC/s Min: 5488     RPC/s Max: 5488     RPC/s
[W] Avg: 10974    RPC/s Min: 10974    RPC/s Max: 10974    RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84     MiB/s Min: 0.84     MiB/s Max: 0.84     MiB/s
[W] Avg: 5487.36  MiB/s Min: 5487.36  MiB/s Max: 5487.36  MiB/s
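
   For reference, the wrapper essentially issues lst commands along the
   following lines (a sketch with our parameters filled in; the actual
   script is the one from the wiki page above):

   ========================================================================
   modprobe lnet_selftest
   export LST_SESSION=$$
   lst new_session read_test
   lst add_group lfrom 192.168.12.1@o2ib
   lst add_group lto 192.168.12.141@o2ib
   lst add_batch bulk_read
   lst add_test --batch bulk_read --concurrency 64 \
       --from lfrom --to lto brw read check=simple size=1M
   lst run bulk_read
   # print statistics for 30 seconds, then stop
   lst stat lfrom lto &
   STAT_PID=$!
   sleep 30
   kill $STAT_PID
   lst end_session
   ========================================================================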

5. Finally, we found that if we unplug and re-plug the InfiniBand cable
   on the client side, the I/O performance recovers to normal.
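
   Unplugging the cable forces the IB link, and the o2ib connections on
   top of it, to be re-established. Before doing that, the client's import
   state and LNet statistics could be recorded for comparison; this is
   again only a sketch of what could be checked, not something we have
   verified:

	# connection state of the client's OSC/MDC imports to the servers
	lctl get_param osc.*.import mdc.*.import

	# LNet-level statistics and peer information on the client
	lnetctl stats show
	lnetctl peer show

	# read and reset the IB port error counters, so that a later check
	# shows only errors accumulated since now
	perfquery -R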


I am asking what else we could do to fix this problem. Any suggestions
are greatly appreciated.

Thank you very much.


T.H.Hsieh

