[lustre-discuss] Poor(?) Lustre performance

Finn Rawles Malliagh up883044 at myport.ac.uk
Sat Apr 16 21:51:28 PDT 2022


Hi all,

I have just set up a three-node Lustre configuration, and initial testing
shows results that I think are slow. The current configuration is 2 OSS and 1
combined MDS/MGS; each node has 4x Intel P3600, 1x Intel P4800, an Intel E810
100GbE NIC, 2x Xeon Gold 6252, and 380 GB of DRAM.
I am using Lustre 2.12.8, ZFS 0.7.13, ice-1.8.3, and rdma-core-35.0 (RoCEv2 is
enabled).
All zpools are set up identically for OST1, OST2, and MDT1:

[root at stor3 ~]# zpool status
  pool: osstank
 state: ONLINE
  scan: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        osstank     ONLINE       0     0     0
          nvme1n1   ONLINE       0     0     0
          nvme2n1   ONLINE       0     0     0
          nvme3n1   ONLINE       0     0     0
        cache
          nvme0n1   ONLINE       0     0     0
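
(For reference, each pool is a plain stripe of the three P3600s with the P4800
as an L2ARC cache device, and the OSTs/MDT live on datasets inside the pools.
Roughly the following, reconstructed from memory, so the NID, index number,
and any extra pool options are placeholders:)

# stripe of three NVMe devices plus the Optane drive as L2ARC
zpool create osstank nvme1n1 nvme2n1 nvme3n1 cache nvme0n1
# format a dataset in the pool as an OST; <mgs-nid> is the MGS's RoCE NID
mkfs.lustre --ost --backfstype=zfs --fsname=lustre --index=0 \
    --mgsnode=<mgs-nid>@o2ib osstank/ost0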

When running "./io500 ./config-minimalLUST.ini" on my Lustre client, I get
these performance numbers:
IO500 version io500-isc22_v1 (standard)
[RESULT]       ior-easy-write        1.173435 GiB/s : time 31.703 seconds [INVALID]
[RESULT]    mdtest-easy-write        0.931693 kIOPS : time 31.028 seconds [INVALID]
[RESULT]       ior-hard-write        0.821624 GiB/s : time 1.070 seconds [INVALID]
[RESULT]    mdtest-hard-write        0.427000 kIOPS : time 31.070 seconds [INVALID]
[RESULT]                 find       25.311534 kIOPS : time 1.631 seconds
[RESULT]        ior-easy-read        5.177930 GiB/s : time 7.187 seconds
[RESULT]     mdtest-easy-stat        0.570021 kIOPS : time 50.067 seconds
[RESULT]        ior-hard-read        5.331791 GiB/s : time 0.167 seconds
[RESULT]     mdtest-hard-stat        1.834985 kIOPS : time 7.998 seconds
[RESULT]   mdtest-easy-delete        1.715750 kIOPS : time 17.308 seconds
[RESULT]     mdtest-hard-read        1.006240 kIOPS : time 13.759 seconds
[RESULT]   mdtest-hard-delete        1.624117 kIOPS : time 8.910 seconds
[SCORE ] Bandwidth 2.271383 GiB/s : IOPS 1.526825 kiops : TOTAL 1.862258 [INVALID]

When running "./io500 ./config-minimalLOCAL.ini" on a single locally
mounted ZFS pool, I get the following performance numbers:
IO500 version io500-isc22_v1 (standard)
[RESULT]       ior-easy-write        1.304500 GiB/s : time 33.302 seconds [INVALID]
[RESULT]    mdtest-easy-write       47.979181 kIOPS : time 1.838 seconds [INVALID]
[RESULT]       ior-hard-write        0.485283 GiB/s : time 1.806 seconds [INVALID]
[RESULT]    mdtest-hard-write       27.801814 kIOPS : time 2.443 seconds [INVALID]
[RESULT]                 find     1384.774433 kIOPS : time 0.074 seconds
[RESULT]        ior-easy-read        3.078668 GiB/s : time 14.111 seconds
[RESULT]     mdtest-easy-stat      343.232733 kIOPS : time 1.118 seconds
[RESULT]        ior-hard-read        3.183521 GiB/s : time 0.275 seconds
[RESULT]     mdtest-hard-stat      333.241620 kIOPS : time 1.123 seconds
[RESULT]   mdtest-easy-delete       45.723381 kIOPS : time 1.884 seconds
[RESULT]     mdtest-hard-read       73.637312 kIOPS : time 1.546 seconds
[RESULT]   mdtest-hard-delete       42.191867 kIOPS : time 1.956 seconds
[SCORE ] Bandwidth 1.578256 GiB/s : IOPS 114.726763 kiops : TOTAL 13.456159 [INVALID]

I have run an iperf3 test and was able to reach speeds of around 40 Gb/s, so
I don't think the network links themselves are the issue (maybe it's something
to do with LNet?).
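
My next step was going to be an LNet self-test between the client and one OSS
to rule LNet in or out; something like the commands below, adapted from the
lnet_selftest example in the Lustre manual (the NIDs are placeholders, and
lnet_selftest has to be loaded on both ends first with modprobe lnet_selftest):

# confirm the o2ib NI is actually configured on each node
lnetctl net show

# bulk write test from the client to one OSS for ~30 seconds
export LST_SESSION=$$
lst new_session rw_test
lst add_group clients <client-nid>@o2ib
lst add_group servers <oss-nid>@o2ib
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from clients --to servers \
    brw write check=simple size=1M
lst run bulk_rw
lst stat servers & sleep 30; kill $!
lst end_session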

Could anyone more knowledgeable than me please explain why the local
three-disk ZFS pool is more performant than the Lustre filesystem?
I'm very new to this kind of benchmarking, so it may also be that I am
misinterpreting the results or not running the test correctly.

cat ./config-minimalLUST.ini
[global]
datadir = /mnt/lustre
timestamp-datadir = TRUE
resultdir = ./results
timestamp-resultdir = TRUE
api = POSIX
drop-caches = FALSE
drop-caches-cmd = sudo -n bash -c "echo 3 > /proc/sys/vm/drop_caches"
verbosity = 1
[debug]
stonewall-time = 300
[ior-easy]
transferSize = 1m
blockSize = 100000m
filePerProc = FALSE
uniqueDir = FALSE
[ior-easy-write]
[mdtest-easy]
n = 10000000
[mdtest-easy-write]
[ior-hard]
segmentCount = 10000000
[ior-hard-write]
[mdtest-hard]
n = 10000000
[mdtest-hard-write]
[find]
nproc = 1
pfind-queue-length = 10000
pfind-steal-next = FALSE
pfind-parallelize-single-dir-access-using-hashing = FALSE