[lustre-discuss] Tools to check a lustre
sid.young at gmail.com
Sun Oct 10 23:07:56 PDT 2021
I'm having trouble diagnosing where the problem lies in my Lustre
installation. The clients are 2.12.6, and I have /home and /lustre
filesystems using Lustre.
/home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs as
The /lustre file system appears fine; I can ls into every directory.
When people log into the login node, it appears to lock up. I have shut down
everything and remounted the OSTs and MDTs etc. in order with no
errors reported, but I'm getting the lockup issue soon after a few people
The backend network is 100G Ethernet using ConnectX5 cards and the OS is
CentOS 7.9. Everything was installed as RPMs and updates are disabled in
Two questions to start with:
Is there a command line tool to check each OST individually?
Apart from /var/log/messages, is there a Lustre-specific log I can monitor
on the login node to see errors when I hit /home...