[lustre-discuss] Rhel8.10 Lustre Kernel Performance Decrease

Andreas Dilger adilger at whamcloud.com
Thu Aug 1 20:33:57 PDT 2024


Sorry, you'll have to ask someone at Azure about this.  I don't know anything about what "Premium Lustre" or "Lustre V2" means, and I can't speak to any kind of performance for their systems.

On Jul 30, 2024, at 08:48, Baucum, Rashun <Rashun.Baucum at td.com> wrote:

Good morning, Andreas.

I apologize for the delay.

Yes, I will provide additional information. I initially asked the question as a general inquiry. The crux of the performance issue is that the newer RHEL 8.10 - 2.15.5 Lustre builds, which are otherwise similar to our earlier builds, do not reach the average performance we have seen previously.

Which version of Lustre for the old and new kernel and was it the same before upgrading to RHEL8.10?

  *   Previous Kernel : 3.10.0-1160.49.1.el7_lustre, RHEL 7.9
  *   Current Kernel : 4.18.0-553.5.1.el8_lustre, RHEL 8.10
  *   It was not the same; we waited for the last minor version of RHEL 8.


Which RHEL version are you comparing against, RHEL 8.9?

  *   Comparison Version : RHEL 8.8
  *   Lustre Client Version :  4.18.0-425.3.1.el8_lustre


Have you upgraded both the clients and servers to RHEL 8.10, or only the clients?

  *   Currently both are upgraded to the RHEL 8.10.



Results of FIO testing

Original Production: Initial baseline established ~1 year ago. This build is no longer in use, but non-"premium" builds should perform at approximately this level.
Lustre V2: Current builds, used as the direct reference point. These used RHEL 8 clients while the Lustre file systems themselves ran RHEL 7.
RHEL 8.10 - 2.15.5: Builds being tested before we push updates to higher environments.

There are minimal changes between the Original Production, Lustre V2 36, and RHEL 8.10 builds. In this specific case, the only differences between the Lustre V2 36 and RHEL 8.10 builds are the Lustre versions and the fact that the RHEL 8.10 Premium Lustre build uses Azure's premium SSDs as disks for OSTs. All other builds use standard HDDs for OSTs.

Write Throughput
Build                                 Latency (sec) Avg  Avg IOPS  Bandwidth (MB/s)  IO (GB)
Original Production                               1.485       689               723     4340
Lustre V2 36                                      1.482       690               724     4346
Lustre V2 108                                     1.338       765               803     4818
RHEL 8.10 - 2.15.5 - Premium Lustre               1.596       640               672     4037
RHEL 8.10 - 2.15.5 - Standard Lustre              2.968       640               362     2173
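
As a quick sanity check on the Write Throughput figures, dividing bandwidth by IOPS gives the implied amount of data per I/O, which should be roughly constant if every run used the same block size. The Python sketch below uses the numbers from the table; the variable names and the 10% tolerance are assumptions chosen only for illustration.

```python
# Write Throughput figures copied from the table: (avg IOPS, bandwidth in MB/s).
write_throughput = {
    "Original Production": (689, 723),
    "Lustre V2 36": (690, 724),
    "Lustre V2 108": (765, 803),
    "RHEL 8.10 - 2.15.5 - Premium Lustre": (640, 672),
    "RHEL 8.10 - 2.15.5 - Standard Lustre": (640, 362),
}


def mb_per_io(iops, bandwidth_mb_s):
    """Implied data moved per I/O, in MB."""
    return bandwidth_mb_s / iops


ratios = {build: mb_per_io(*row) for build, row in write_throughput.items()}

# Use the median ratio as the reference value.
# TOLERANCE is an arbitrary threshold picked for this illustration.
TOLERANCE = 0.10
reference = sorted(ratios.values())[len(ratios) // 2]
outliers = [
    build
    for build, ratio in ratios.items()
    if abs(ratio - reference) / reference > TOLERANCE
]

for build, ratio in ratios.items():
    flag = "  <-- differs from the other rows" if build in outliers else ""
    print(f"{build:<38} {ratio:.3f} MB per I/O{flag}")
```

With these numbers, only the Standard Lustre row stands out: its IOPS and bandwidth values imply a different amount of data per I/O than the other four builds, so one of the two values may be worth re-checking against the raw FIO output.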

Write IOPS
Build                                 Latency (sec) Avg  Avg IOPS  Bandwidth (MB/s)  IO (GB)
Original Production                              10.354        24                26      156
Lustre V2 36                                     18.709        13                14       86
Lustre V2 108                                     4.911        52                55      328
RHEL 8.10 - 2.15.5 - Premium Lustre               5.839        43                45      276
RHEL 8.10 - 2.15.5 - Standard Lustre             11.821        21                23      137

Read Throughput
Build                                 Latency (sec) Avg  Avg IOPS  Bandwidth (MB/s)  IO (GB)
Original Production                               1.292       792               831     4986
Lustre V2 36                                      1.223       836               877     5269
Lustre V2 108                                     2.307       754               791     4750
RHEL 8.10 - 2.15.5 - Premium Lustre               1.598       640               672     4033
RHEL 8.10 - 2.15.5 - Standard Lustre              3.022       338               355     2134

Read IOPS
Build                                 Latency (sec) Avg  Avg IOPS  Bandwidth (MB/s)  IO (GB)
Original Production                               4.190        61                64      384
Lustre V2 36                                      5.887        43                45      274
Lustre V2 108                                     5.076        50                53      318
RHEL 8.10 - 2.15.5 - Premium Lustre               5.954        42                45      271
RHEL 8.10 - 2.15.5 - Standard Lustre             11.924        21                23      135
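
To put the differences in relative terms, the change in total IO can be computed against Lustre V2 36, which is described above as the direct reference point. The following is a minimal Python sketch using the IO (GB) values from the four tables; the short build labels and the function name are illustrative assumptions, not part of the FIO output.

```python
# Total IO (GB) per FIO test, copied from the four result tables.
# The short keys "V2 36", "Premium" and "Standard" are illustrative labels only.
io_gb = {
    "Write Throughput": {"V2 36": 4346, "Premium": 4037, "Standard": 2173},
    "Write IOPS": {"V2 36": 86, "Premium": 276, "Standard": 137},
    "Read Throughput": {"V2 36": 5269, "Premium": 4033, "Standard": 2134},
    "Read IOPS": {"V2 36": 274, "Premium": 271, "Standard": 135},
}


def pct_change(baseline, value):
    """Percentage change of value relative to baseline (negative means a drop)."""
    return (value - baseline) / baseline * 100.0


# Change of each RHEL 8.10 build relative to the Lustre V2 36 reference.
changes = {
    test: {
        build: pct_change(row["V2 36"], row[build])
        for build in ("Premium", "Standard")
    }
    for test, row in io_gb.items()
}

for test, row in changes.items():
    print(
        f"{test:<17} Premium {row['Premium']:+7.1f}%   "
        f"Standard {row['Standard']:+7.1f}%"
    )
```

For example, the Standard Lustre build's Write Throughput total IO is exactly half of the Lustre V2 36 figure (2173 GB against 4346 GB).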


Thanks,
Rashun Baucum




From: Andreas Dilger <adilger at whamcloud.com>
Sent: Thursday, July 4, 2024 12:40 AM
To: Baucum, Rashun <Rashun.Baucum at td.com>
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] Rhel8.10 Lustre Kernel Performance Decrease


On Jul 3, 2024, at 13:12, Baucum, Rashun via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:

Good afternoon,

We have recently started performance testing on the new RHEL 8.10 Lustre kernel and have noticed a drop in performance in our initial testing. It's roughly a 30-40% drop in total IO observed with our FIO testing. Has anyone else noticed any performance decreases?

Hi Rashun,
could you please be more specific about what you are comparing?  Which version of Lustre for the old and new kernel, and was it the same before upgrading to RHEL8.10?  Which RHEL version are you comparing against, RHEL 8.9?  Have you upgraded both the clients and servers to RHEL8.10, or only the clients?

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud









Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud






