[lustre-discuss] rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server
Äkäslompolo Simppa
simppa.akaslompolo at aalto.fi
Fri Aug 29 02:05:12 PDT 2025
Hi!
I wanted to give an early warning (and a cry for help) in case others are facing similar issues.
Coincidence or not, our Lustre setup became unstable soon after we started migrating nodes from RHEL 9.5 to RHEL 9.6.
The key symptom is high load on the metadata servers: processes like ldlm_cn03_017 take all available CPU time.
Excessive memory consumption also occurred yesterday, which crashed the servers completely.
These processes are kernel "daemon"s of the Lustre distributed lock manager (ldlm).
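
For anyone who wants to check whether their own metadata servers show the same symptom, here is a minimal sketch that lists the busiest ldlm_* kernel threads. It assumes a Linux host with the procps `ps` command; the function names `parse_ps` and `top_ldlm_threads` are made up for this illustration and are not part of Lustre. Note that the %CPU value reported by `ps` is averaged over the lifetime of the thread, so it only gives a rough picture compared to `top`.

```python
import subprocess


def parse_ps(text, prefix="ldlm_"):
    """Parse `ps -eo pid,pcpu,comm` output.

    Returns (cpu_percent, pid, name) tuples for threads whose name
    starts with `prefix`, sorted with the busiest thread first.
    """
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, pcpu, comm = parts
        comm = comm.strip()
        if not comm.startswith(prefix):
            continue
        try:
            rows.append((float(pcpu), int(pid), comm))
        except ValueError:
            continue  # ignore lines that do not parse cleanly
    return sorted(rows, reverse=True)


def top_ldlm_threads(limit=10):
    """Run ps and return the `limit` busiest ldlm_* threads (assumes procps ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout)[:limit]
```

Calling `top_ldlm_threads()` on a metadata server returns a list that can be printed or logged periodically, for example from a cron job, to see when the lock threads start to dominate.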
Best regards,
--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland
+358-50-5311327
https://scicomp.aalto.fi/