[lustre-discuss] Cannot mount from Lustre from client any longer
Mohr Jr, Richard Frank
rmohr at utk.edu
Thu Jun 27 07:46:15 PDT 2019
> On Jun 27, 2019, at 8:16 AM, Miguel Santos Novoa <miguelsn at met.no> wrote:
> For the last couple of weeks we have been adding and removing OSTs, and we were also running tests with a client using Lustre version 2.12, which is our main hypothesis for the cause of the problem. We are not sure what is causing this behavior.
> We can no longer mount Lustre from any of our clients, although the active mounts are still being served and nothing else seems to be affected. Because of the nature and importance of the system, we have not rebooted the MDS/MDT server and would prefer not to try that.
It might be a long shot, but you could try dropping caches on the Lustre servers (echo 3 > /proc/sys/vm/drop_caches).
I have an issue on one of my file systems running Lustre 2.9 that seems to be a bug related to the IB stack. After a certain point, the servers start getting memory allocation errors. Existing clients that already have Lustre mounted keep working fine, but I can’t mount any new clients. After dropping caches, I can mount new clients again. (I realize that you are using TCP instead of IB, but since the symptoms sound very similar to what I have seen, it might be worth a try.)
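
A minimal sketch of that step, assuming a Linux server and a root shell. The function name drop_caches and the DROP_CACHES_FILE override are hypothetical helpers for illustration, not part of Lustre; the only real interfaces used are /proc/sys/vm/drop_caches and /proc/meminfo. It flushes dirty pages first, because only clean cache entries can be freed, and prints a few memory counters before and after so you can see whether anything was released.

```shell
#!/bin/sh
# Sketch only: drop_caches and DROP_CACHES_FILE are hypothetical names,
# not part of Lustre. Run as root on each Lustre server.

drop_caches() {
    # Real kernel interface by default; overridable so the function can be dry-run.
    target="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"

    echo "Before:"
    grep -E '^(MemFree|Cached|Slab):' /proc/meminfo

    # Write dirty pages to disk first; only clean cache entries can be freed.
    sync

    # 3 = free the page cache plus dentries and inodes.
    echo 3 > "$target" || return 1

    echo "After:"
    grep -E '^(MemFree|Cached|Slab):' /proc/meminfo
    return 0
}
```

To use it, source the file in a root shell on the server and run drop_caches, then retry the mount from a client. Dropping caches is non-destructive, but the server may be slower for a while as the caches refill.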
Senior HPC System Administrator
National Institute for Computational Sciences