[lustre-discuss] mdt: unhealthy - healthy

Thomas Roth t.roth at gsi.de
Fri Jul 26 03:28:29 PDT 2019

Hi all,

this morning one of our MDTs went 'unhealthy':

> Jul 26 10:15:13 lxmds20 kernel: LustreError: 9510:0:(service.c:3285:ptlrpc_svcpt_health_check()) mdt: unhealthy - request has been waiting 1017s

However, somewhat later,

> lxmds20:~# cat /sys/fs/lustre/health_check

and all Lustre operations seem to be good, too.
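
A check like the one above could be scripted for monitoring. The following is only a sketch: the function name lustre_health and its return codes are made up for illustration, and it assumes that the health_check file reads exactly "healthy" when all is well, treating any other content as a problem.

```shell
#!/bin/sh
# Sketch: classify the content of the Lustre health_check file.
# Assumption: the file reads "healthy" when all is well; any other
# content is reported as a problem. lustre_health is a hypothetical
# helper, not part of Lustre.

# lustre_health [FILE]
#   prints "ok" and returns 0 if FILE reads "healthy"
#   prints "problem: <content>" and returns 1 otherwise
#   prints "problem: cannot read <FILE>" and returns 2 if FILE is unreadable
lustre_health() {
    file="${1:-/sys/fs/lustre/health_check}"
    if ! status=$(cat "$file" 2>/dev/null); then
        echo "problem: cannot read $file"
        return 2
    fi
    if [ "$status" = "healthy" ]; then
        echo "ok"
        return 0
    fi
    echo "problem: $status"
    return 1
}
```

Run periodically on the MDS, the return code could feed whatever alerting is already in place; whether a transient non-healthy reading should trigger any action is exactly the open question here.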

It used to be that if an MDT went unhealthy, all of Lustre was in a kind of 'undefined' state, and
you had to reboot and run a file system check.
This is now Lustre 2.10.6 - can it heal itself reliably, or should we still take some action?


Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Ursula Weyrich, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Georg Schütte
