[lustre-discuss] Tools to check a lustre

Sid Young sid.young at gmail.com
Sun Oct 10 23:07:56 PDT 2021


I'm having trouble diagnosing where the problem lies in my Lustre
installation. Clients are 2.12.6 and I have /home and /lustre
filesystems using Lustre.

/home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs as
ACTIVE.
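
A minimal sketch of scripting that lfs df check per mount point with a
timeout, so a hung filesystem is reported rather than blocking the shell.
It assumes Python 3 on the client; the probe helper and the 30-second
default timeout are illustrative and not part of Lustre.

```python
import subprocess


def probe(cmd, timeout=30.0):
    """Run cmd and classify the result as 'ok', 'failed', 'timeout' or 'missing'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except FileNotFoundError:
        # The executable (for example lfs) is not on PATH.
        return "missing"
    except subprocess.TimeoutExpired:
        # The command did not finish in time, which suggests a hang.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Mount points taken from the description above.
    for mount in ("/home", "/lustre"):
        print(mount, probe(["lfs", "df", mount]))
```

One caveat: a child process blocked in uninterruptible I/O may not exit
when killed, in which case the probe itself could stall.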

The /lustre file system appears fine; I can *ls* into every directory.

When people log into the login node, it appears to lock up. I have shut down
everything and remounted the OSTs and MDTs etc. in order, with no
errors reported, but the lockup recurs soon after a few people
log in.
The backend network is 100G Ethernet using ConnectX-5 cards and the OS is
CentOS 7.9. Everything was installed as RPMs and updates are disabled in
yum.conf.

Two questions to start with:
Is there a command-line tool to check each OST individually?
Apart from /var/log/messages, is there a Lustre-specific log I can monitor
on the login node to see errors when I hit /home...



Sid Young