[lustre-discuss] rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server
Peter Jones
pjones at whamcloud.com
Fri Aug 29 03:51:28 PDT 2025
Simppa
Details about which Lustre version you are running (and whether the backend is ldiskfs or ZFS) could well be relevant to this situation.
Peter
On 8/29/25, 2:09 AM, "lustre-discuss on behalf of Äkäslompolo Simppa via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss at lists.lustre.org> wrote:
Hi!
I thought I would give an early warning / cry for help in case others are facing similar issues.
Coincidence or not, our Lustre setup became unstable soon after we started migrating nodes from RHEL 9.5 to RHEL 9.6.
The key symptom is high load on the metadata servers: processes like ldlm_cn03_017 take all available CPU time.
Memory hogging also occurred yesterday, which crashed the servers completely.
These processes are kernel "daemons" of the distributed lock manager.
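For anyone who wants to check whether they see the same symptom, here is a minimal sketch that lists the ldlm_cn threads by CPU usage. It is not a Lustre tool: the function names `ldlm_cn_cpu` and `snapshot` are made up for this example, and the thread-name pattern is only inferred from the single name quoted above (ldlm_cn03_017). Note that the `pcpu` value reported by `ps` is an average over the thread's lifetime, not an instantaneous reading.

```python
import re
import subprocess

# Assumed naming pattern, inferred from the one example "ldlm_cn03_017".
LDLM_CN = re.compile(r"^ldlm_cn\d+_\d+$")


def ldlm_cn_cpu(ps_output):
    """Return [(thread_name, cpu_percent), ...], highest CPU first.

    Expects one "<comm> <pcpu>" pair per line, as printed by
    `ps -e -o comm= -o pcpu=`.
    """
    rows = []
    for line in ps_output.splitlines():
        # Split from the right, since a command name may contain spaces.
        parts = line.rsplit(None, 1)
        if len(parts) != 2 or not LDLM_CN.match(parts[0]):
            continue
        try:
            rows.append((parts[0], float(parts[1])))
        except ValueError:
            continue  # skip lines whose CPU column is not a number
    return sorted(rows, key=lambda row: row[1], reverse=True)


def snapshot():
    """Run ps once and return its raw output (header lines suppressed)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm=", "-o", "pcpu="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a metadata server, `for name, cpu in ldlm_cn_cpu(snapshot()): print(name, cpu)` would print the matching threads, busiest first.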
Best regards,
--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland
+358-50-5311327
https://scicomp.aalto.fi/
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org