[lustre-discuss] OST went back in time: no(?) hardware issue

Thomas Roth t.roth at gsi.de
Wed Oct 4 22:17:40 PDT 2023


Hi Andreas,

On 10/5/23 02:30, Andreas Dilger wrote:
> On Oct 3, 2023, at 16:22, Thomas Roth via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
>>
>> Hi all,
>>
>> in our Lustre 2.12.5 system, we have "OST went back in time" after OST hardware replacement:
>> - hardware had reached EOL
>> - we set `max_create_count=0` for these OSTs, searched for and migrated off the files of these OSTs
>> - formatted the new OSTs with `--replace` and the old indices
>> - all OSTs are on ZFS
>> - set the OSTs `active=0` on our 3 MDTs
>> - moved in the new hardware, reused the old NIDs, old OST indices, mounted the OSTs
>> - set the OSTs `active=1`
>> - ran `lfsck` on all servers
>> - set `max_create_count=200` for these OSTs
>>
>> Now the "OST went back in time" messages appeared in the MDS logs.
>>
>> This doesn't quite fit the description in the manual. There were no crashes or power losses, so I cannot see which cache might have been lost.
>> The transaction numbers quoted in the error are both large, e.g. `transno 55841088879 was previously committed, server now claims 4294992012`
>>
>> What should we do? Give `lfsck` another try?
> 
> Nothing really to see here I think?
> 
> Did you delete LAST_RCVD during the replacement and the OST didn't know what transno was assigned to the last RPCs it sent?  The still-mounted clients have a record of this transno and are surprised that it was reset.  If you unmount and remount the clients the error would go away.


No, I don't think I deleted anything during the procedure (see the command sketch after this list).
- The old OST was emptied (`max_create_count=0`) during normal Lustre operation. Its last transaction should be roughly the last file being migrated away.
- Then the OST is deactivated, but only on the MDS, not on the clients.
- Then the new OST, formatted with `--replace`, is mounted. It is activated on the MDS. Up to this point, no errors.
- Finally, `max_create_count` is increased, and clients can write.
- Now the MDT throws this error (nothing in the client logs).
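For completeness, a rough sketch of the commands behind those steps (fsname `testfs`, index `0012`/18, pool, NID and mount point are placeholders, not our actual values):

  # on each MDS: stop new object creation on the OST being retired
  lctl set_param osp.testfs-OST0012-osc-MDT*.max_create_count=0

  # on a client: find files with objects on that OST, migrate them off
  lfs find /lustre --ost testfs-OST0012_UUID | lfs_migrate -y

  # on the new OSS: format the replacement OST with the old index
  mkfs.lustre --ost --backfstype=zfs --replace --index=18 \
      --fsname=testfs --mgsnode=mgs@tcp ostpool/ost18

  # on each MDS: deactivate, mount the new OST on the OSS, reactivate
  lctl set_param osp.testfs-OST0012-osc-MDT*.active=0
  mount -t lustre ostpool/ost18 /mnt/ost18      # on the OSS
  lctl set_param osp.testfs-OST0012-osc-MDT*.active=1

  # on the MDS: start LFSCK on all targets, then allow creates again
  lctl lfsck_start -A
  lctl set_param osp.testfs-OST0012-osc-MDT*.max_create_count=200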

According to the manual, this is what should have happened when I mounted the new OST:
> The MDS and OSS will negotiate the LAST_ID value for the replacement OST.

Ok, this is about LAST_ID, wherever that is on ZFS.
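At least the negotiated value should be visible from userspace, so the on-disk location shouldn't matter; something like this (target names again placeholders):

  # on the OSS: last object ID allocated on the OST
  lctl get_param obdfilter.testfs-OST0012.last_id

  # on each MDS: what the MDS believes about that OST's allocations
  lctl get_param osp.testfs-OST0012-osc-MDT*.prealloc_last_id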

About LAST_RCVD, the manual says (even for the case where the configuration files were lost and have to be recreated):
> The last_rcvd file will be recreated when the OST is first mounted using the default parameters,


So, let's see what happens once the clients remount.
And should I then also restart the MDTs?
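The remount on each client would be something like the following (MGS NID and mount point are placeholders):

  umount /lustre
  mount -t lustre mgs@tcp:/testfs /lustre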


Regards,
Thomas

> 
> I'm not sure if the clients might try to preserve the next 55B RPCs in memory until the committed transno on the OST catches up, or if they just accept the new transno and get on with life?
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud

