[lustre-discuss] Changelog record cleanup in /O/1/d*

Prescott,Craig P prescott at rc.ufl.edu
Mon Dec 5 13:02:07 PST 2016


We were running 2.5.3.90 with changelogs enabled earlier this summer. We ran into a catalog corruption issue (LU-6556), so we decided to deregister our changelog users, move the CONFIGS/changelog_{catalog,users} files out of the way, and carry on until we had an opportunity to upgrade. We did not remove anything from /O/1/d* at that time (though we probably should have).


We've observed that mounting our MDT can take several minutes or more. I can see with iostat that the MDT is very busy with reads while it is being mounted. I suspect that the stale files in /O/1/d* are the reason (there are lots of them), as they are processed by the OSP sync at MDT startup. I looked at the /O/1/d* directories with debugfs: there are thousands of files, and their timestamps are consistent with when we were using changelogs. I dumped a few randomly selected ones and checked with llog_reader that the records they contain are of type CHANGELOG_REC (type=10660000).


At the least, I think we should remove the files in /O/1/d* that contain CHANGELOG_REC entries. Can I just delete every file in /O/1/d*, or do I need to be careful and only remove the files that contain CHANGELOG_REC entries?


The reason I ask is that I do see a handful of files that are not changelog-related in these directories - their timestamps are newer and their record type as reported by llog_reader is not CHANGELOG_REC or CHANGELOG_USER.  There are only a small number of such files, though.
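
To make the distinction above concrete, here is a rough sketch of how the llog_reader output for each dumped file could be sorted into "changelog only" versus "something else". It assumes that every record in the llog_reader output carries a "type=<hex>" token, as in the type=10660000 value quoted above; the function names and the category labels are made up for illustration. It only classifies text that has already been captured. It does not touch the MDT, and it says nothing about whether deleting a file is safe.

```python
import re

# Record type quoted in this message for CHANGELOG_REC entries.
CHANGELOG_REC = "10660000"

# Assumption: each record in llog_reader output contains a "type=<hex>" token.
TYPE_RE = re.compile(r"\btype=([0-9a-fA-F]+)\b")


def record_types(llog_text):
    """Return the set of record type strings found in llog_reader output."""
    return {t.lower() for t in TYPE_RE.findall(llog_text)}


def classify(llog_text):
    """Label one file's llog_reader output (labels are hypothetical).

    'changelog' : every record type seen is CHANGELOG_REC
    'mixed'     : CHANGELOG_REC appears alongside other record types
    'other'     : records present, none of them CHANGELOG_REC
    'unknown'   : no "type=" token found at all
    """
    types = record_types(llog_text)
    if not types:
        return "unknown"
    if types == {CHANGELOG_REC}:
        return "changelog"
    if CHANGELOG_REC in types:
        return "mixed"
    return "other"
```

Feeding each file's saved llog_reader output through classify() would give a list of files that contain nothing but CHANGELOG_REC records, with anything labelled 'mixed', 'other' or 'unknown' left for a closer look by hand.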


Thanks,

Craig Prescott

University of Florida Research Computing