[lustre-discuss] Tools to check a lustre

Dennis Nelson dnelson at ddn.com
Mon Oct 11 05:20:25 PDT 2021


Have you tried running "lfs check servers" on the login node?
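
A minimal sketch of how that check could be narrowed down to the targets that look unhealthy. It assumes "lfs check servers" prints one line per target and that healthy lines end in "active."; please verify that against the output on your own client before relying on it. The function name filter_not_active is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `lfs check` output on stdin and print only
# the lines that do NOT end in "active." (assumed healthy-line format).
filter_not_active() {
    # grep -v exits non-zero when nothing is left to print; treat that as success.
    grep -v 'active\.$' || true
}

# Only run the real check when the lfs client tool is installed.
if command -v lfs >/dev/null 2>&1; then
    # "servers" covers both MDTs and OSTs; "lfs check osts" limits it to OSTs.
    lfs check servers 2>&1 | filter_not_active
fi
```

If this prints nothing, every target the client knows about was reported as active. Anything it does print is worth comparing against /var/log/messages on the login node.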


On Oct 11, 2021, at 2:58 AM, Sid Young via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:



I'm having trouble diagnosing where the problem lies in my Lustre installation. The clients are 2.12.6, and both /home and /lustre are Lustre filesystems.

/home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs as ACTIVE.

The /lustre file system appears fine; I can ls into every directory.

When people log into the login node, it appears to lock up. I have shut down everything and remounted the OSTs, MDTs, etc. in order, with no errors reported, but the lockup recurs soon after a few people log in.
The backend network is 100G Ethernet using ConnectX-5 cards and the OS is CentOS 7.9. Everything was installed from RPMs, and updates are disabled in yum.conf.

Two questions to start with:
Is there a command line tool to check each OST individually?
Apart from /var/log/messages, is there a Lustre-specific log I can monitor on the login node to see errors when I hit /home...



Sid Young