[lustre-discuss] rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server

Äkäslompolo Simppa simppa.akaslompolo at aalto.fi
Fri Aug 29 06:00:48 PDT 2025


Hi!

We only migrated our clients. The new client version is cray-2.15.B22.

However, it is becoming clear that this was a coincidence.
One specific user's jobs drove the load up to 50 within a few minutes.

Best regards,
--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland

+358-50-5311327
https://scicomp.aalto.fi/



________________________________________
From: Einar Næss Jensen <einar.nass.jensen at ntnu.no>
Sent: Friday, August 29, 2025 14:37
To: lustre-discuss; Äkäslompolo Simppa
Subject: Re: rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server

Was it the servers that were migrated or the clients? Or both?

(Asking because we upgraded our clients to 9.6 (Rocky) very recently, but our servers are still on 8.10 (Rocky).)

We have not seen the behaviour you describe in our setup.

Best Regards
Einar


________________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Äkäslompolo Simppa via lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Friday, August 29, 2025 11:05
To: lustre-discuss
Subject: [lustre-discuss] rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server

Hi!

I thought I would give an early warning / cry for help in case others are facing similar issues.

Coincidence or not, our Lustre setup became unstable soon after we started migrating nodes from RHEL 9.5 to RHEL 9.6.

The key symptom is high load on the metadata servers: processes like ldlm_cn03_017 take all available CPU time.
There was also memory hogging yesterday, which crashed the servers completely.

These processes are kernel "daemons" of the Lustre distributed lock manager (LDLM).

Best regards,

--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland

+358-50-5311327
https://scicomp.aalto.fi/


