[lustre-discuss] OST went back in time: no(?) hardware issue

Thomas Roth t.roth at gsi.de
Tue Oct 3 07:22:55 PDT 2023


Hi all,

in our Lustre 2.12.5 system, we are seeing "OST went back in time" messages after an OST hardware replacement:
- hardware had reached EOL
- we set `max_create_count=0` for these OSTs, then searched for the files on these OSTs and migrated them off
- formatted the new OSTs with `--replace` and the old indices
- all OSTs are on ZFS
- set the OSTs `active=0` on our 3 MDTs
- moved in the new hardware, reused the old NIDs, old OST indices, mounted the OSTs
- set the OSTs `active=1`
- ran `lfsck` on all servers
- set `max_create_count=200` for these OSTs
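
For illustration, the MDS-side parameter changes in the steps above could be generated with a small helper like the one below. This is only a sketch: the filesystem name `testfs`, the OST indices, and the helper name `osp_param_commands` are hypothetical, and the `osp.<fsname>-OST<index>-osc-MDT*` parameter path is assumed to match the usual naming on the MDS nodes. The script only prints command lines; it does not run anything.

```python
def osp_param_commands(fsname, ost_indices, param, value):
    """Build 'lctl set_param' command lines for the OSP devices of the given OSTs."""
    commands = []
    for idx in ost_indices:
        # OST target names carry a four-digit hexadecimal index, e.g. OST000a for index 10.
        target = f"{fsname}-OST{idx:04x}-osc-MDT*"
        commands.append(f"lctl set_param osp.{target}.{param}={value}")
    return commands


# Hypothetical filesystem name and OST indices, for illustration only.
for cmd in osp_param_commands("testfs", [10, 11], "max_create_count", 0):
    print(cmd)
```

The same helper would cover the `active=0`, `active=1` and `max_create_count=200` steps by changing `param` and `value`.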

Now the "OST went back in time" messages appeared in the MDS logs.

This doesn't quite fit the description in the manual. There were no crashes or power losses, so I cannot see which cache could have been lost.
The transaction numbers quoted in the error are both large, e.g. `transno 55841088879 was previously committed, server now claims 4294992012`
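
To make the two numbers easier to compare, the sketch below splits each into an upper and a lower 32-bit half. The split at bit 32 is an assumption made for illustration (the helper name `split_transno` is hypothetical); whether Lustre really treats the upper half as a separate counter is not confirmed anywhere in this post.

```python
def split_transno(transno):
    """Split a 64-bit transaction number into (upper 32 bits, lower 32 bits)."""
    return transno >> 32, transno & 0xFFFFFFFF


previously_committed = 55841088879  # value the MDS remembers
now_claimed = 4294992012            # value the OST reports now

print(split_transno(previously_committed))  # (13, 6514031)
print(split_transno(now_claimed))           # (1, 24716)
```

Under that assumption, the value the server now claims is only 24716 above 2^32, while the previously committed one has an upper half of 13. That would be consistent with a freshly formatted OST counting from a new start, though it does not by itself explain the message.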

What should we do? Give `lfsck` another try?

Regards,
Thomas


-- 
--------------------------------------------------------------------
Thomas Roth
Department: IT

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz


