[lustre-discuss] 2.16.1 ptlrpcd infinite loop when machine runs out of RAM

Laura Hild lsh at jlab.org
Wed Feb 5 07:21:10 PST 2025


I wanna say 2.15 started printing those messages (the obd_memory ones, not the spinning ptlrpcd) on every OoM. I remember seeing them when we first had 2.15 clients and looking them up. I take it you're not getting a corresponding OoM for each, though?

It is typical for a host to struggle if OoM conditions are happening regularly. Is there a workload manager where you could contain individual jobs' memory usage, and limit the total to something with a bigger margin for the system?
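
To make the "bigger margin" idea concrete, here is a rough Python sketch of the arithmetic: take the node's MemTotal, hold back a reserve for the system, and treat what is left as the combined ceiling for jobs. The function names, the sample meminfo text, and the 16 GiB reserve are all made up for illustration; the right reserve depends on the node and on how much memory the Lustre client itself needs, and the resulting figure would still have to be enforced by whatever workload manager or cgroup setup is in use.

```python
def parse_mem_total_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:       131072000 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def job_memory_budget_kib(mem_total_kib, system_reserve_kib):
    """Memory left for all jobs combined after holding back a system margin."""
    if system_reserve_kib >= mem_total_kib:
        raise ValueError("reserve is not smaller than total memory")
    return mem_total_kib - system_reserve_kib


# Hypothetical node; on a real host the text would come from /proc/meminfo.
sample_meminfo = "MemTotal:       131072000 kB\nMemFree:         1000000 kB\n"
reserve_kib = 16 * 1024 * 1024  # assumed 16 GiB margin, purely illustrative

total_kib = parse_mem_total_kib(sample_meminfo)
budget_kib = job_memory_budget_kib(total_kib, reserve_kib)
print(f"total={total_kib} KiB, jobs may use up to {budget_kib} KiB combined")
```

The point is only that the sum of the per-job limits should stay at or below that budget, rather than adding up to the whole of MemTotal.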


