[lustre-discuss] MDT restart: WAITING non-ready MDTs
Thomas Roth
t.roth at gsi.de
Mon Jan 20 05:33:58 PST 2020
As is to be expected, MDT no. 2 did not like the situation either:
:~# cat /proc/fs/lustre/mdt/hebe-MDT0002/recovery_status
status: WAITING
non-ready MDTs: 0001
recovery_start: 1579525859
time_waited: 23
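
For watching several MDTs at once, the status output above can be read programmatically. This is only a minimal sketch in Python: the function names `parse_recovery_status` and `summarize` are hypothetical, the field names are taken from the output shown here, and it is an assumption that several non-ready MDTs would be listed separated by spaces and that `recovery_start` is in seconds since the epoch.

```python
from datetime import datetime, timezone


def parse_recovery_status(text):
    """Split 'key: value' lines, as seen in the recovery_status output, into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Pick out the fields relevant while an MDT is in status WAITING."""
    # Assumption: multiple non-ready MDT indices would be space-separated.
    waiting_on = fields.get("non-ready MDTs", "").split()
    start = fields.get("recovery_start")
    # Assumption: recovery_start is seconds since the Unix epoch.
    started_utc = (
        datetime.fromtimestamp(int(start), tz=timezone.utc) if start else None
    )
    waited = fields.get("time_waited")
    return {
        "status": fields.get("status"),
        "waiting_on": waiting_on,
        "started_utc": started_utc,
        "time_waited": int(waited) if waited is not None else None,
    }


# Sample text copied from the output of MDT0002 above.
sample = """status: WAITING
non-ready MDTs: 0001
recovery_start: 1579525859
time_waited: 23"""

summary = summarize(parse_recovery_status(sample))
print(summary["status"], "waiting on MDT(s):", ", ".join(summary["waiting_on"]))
```

On a server, the text would come from reading the recovery_status file shown above instead of the sample string.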
I was already reading LU-9748 and chewing my nails about an ad-hoc upgrade (this is a Lustre 2.10.6
system), when MDT 1 finally relented, obviously getting the necessary logs now that MDT 2 was
back and had finished its recovery.
Then, of course, MDT 2 also recovered.
In such a situation, would 'lctl abort_recovery' help?
Or shutting down all three servers and then restarting them in the order 0 - 1 - 2?
Regards,
Thomas
On 20/01/2020 14.00, Thomas Roth wrote:
> Hi all,
>
> I had to restart our MDTs 1 and 2.
> No. 2 is still doing a file system check; no. 1 is mounted again and should be in recovery, however:
>
> :~# cat recovery_status
> status: WAITING
> non-ready MDTs: 0002
> recovery_start: 1579524336
> time_waited: 538
>
>
> Seems I have misunderstood the organisation of multiple MDTs: I thought they were independent of each
> other - except that MDT 0 has the root of the filesystem, of course.
>
> But do the others wait for everybody to be online?
>
>
> Regards,
> Thomas
>
>
>
--
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453 Fax: +49-6159-71 2986
GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz