[lustre-discuss] mdt: unhealthy - healthy

Thomas Roth t.roth at gsi.de
Fri Jul 26 03:28:29 PDT 2019


Hi all,

this morning one of our MDTs went 'unhealthy':

> Jul 26 10:15:13 lxmds20 kernel: LustreError: 9510:0:(service.c:3285:ptlrpc_svcpt_health_check()) mdt: unhealthy - request has been waiting 1017s
...

However, somewhat later,

> lxmds20:~# cat /sys/fs/lustre/health_check
healthy

and all Lustre operations seem to be good, too.
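
To see when the state flips and how long it stays that way, the check above could be wrapped in a small helper. This is only a rough sketch: the function name check_health and its return codes are made up for illustration, and the only thing taken from the system is the health_check path shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): report whether a health_check
# file reads "healthy". The default path is the one queried above; a
# different path can be passed as the first argument, e.g. for testing.
check_health() {
    file="${1:-/sys/fs/lustre/health_check}"

    # Treat an unreadable file as "unknown" rather than as unhealthy.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi

    status=$(cat "$file")
    if [ "$status" = "healthy" ]; then
        echo "healthy"
        return 0
    fi

    # Anything other than the literal string "healthy" is passed through.
    echo "unhealthy: $status"
    return 1
}
```

Calling check_health once a minute from cron or a sleep loop and logging the output together with a timestamp would show whether the MDT really recovered on its own and after how long.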


It used to be that if an MDT went unhealthy, all of Lustre was left in a kind of 'undefined' state, and you had
to reboot and run a file system check.
We are now on Lustre 2.10.6 - can it heal itself reliably, or should we still take some action?

Regards
Thomas



-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Ursula Weyrich, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Georg Schütte

