[lustre-discuss] changelog catalog

H.J. Zilverberg h.j.zilverberg at rug.nl
Tue May 2 05:49:30 PDT 2017


Hello all,

We are experiencing some problems with the changelog catalog.
We had this enabled for Robinhood, but due to circumstances we stopped
Robinhood and forgot to disable the changelog.
At first this didn't cause problems, but after a month or two users were
unable to write/delete files. In the logs we got:

kernel: LustreError: 27410:0:(llog_cat.c:82:llog_cat_new_log()) no free catalog slots for log...

Investigating this issue showed us that there were quite a few records
in the changelog.

[root@pg-mds02 log]# lctl get_param mdd.pghome01-MDT0000.changelog_users
mdd.pghome01-MDT0000.changelog_users=current index: 4758154916
ID    index
cl1   609095732

This looks like a 32-bit number issue: the current index is above
2^32 (4294967296).
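
To make the numbers above concrete, here is a minimal sketch in Python
that parses the changelog_users output quoted above and computes, per
registered user, the gap between the current index and that user's
index. The function name changelog_backlog is hypothetical, and reading
that gap as "records not yet cleared" is an assumption on our side, not
something we have confirmed in the Lustre code.

```python
import re

# Output of `lctl get_param mdd.pghome01-MDT0000.changelog_users`,
# copied from the message above.
SAMPLE = """\
mdd.pghome01-MDT0000.changelog_users=current index: 4758154916
ID    index
cl1   609095732
"""


def changelog_backlog(text):
    """Return (current_index, {user_id: current_index - user_index}).

    Assumption: the difference approximates the number of records the
    user has not cleared yet.
    """
    match = re.search(r"current index:\s*(\d+)", text)
    if match is None:
        raise ValueError("no 'current index' line found")
    current = int(match.group(1))

    backlog = {}
    for line in text.splitlines():
        # User lines look like "cl1   609095732".
        user = re.match(r"\s*(cl\d+)\s+(\d+)", line)
        if user:
            backlog[user.group(1)] = current - int(user.group(2))
    return current, backlog


current, backlog = changelog_backlog(SAMPLE)
print("current index above 2**32:", current > 2**32)
print("backlog per user:", backlog)
```

For the values above this gives a gap of 4149059184 for cl1, which is
just below 2^32, while the current index itself is already above 2^32.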
Deregistering the user didn't help: the process hogged one CPU, and
after it had run for 2 days the filesystem was still acting strangely.
When creating a new file you would get a bad address error back, but the
file was created. Editing the file after that did work.

So we decided to kill it, reboot the servers, fsck the file systems and
mount it all again. This worked without a problem.
To test if the changelog catalog was cleared, we decided to register a
changelog catalog user again and this time the current index matched the
user, which is what we expected. Unfortunately, when we deregistered the
user again, the process went back to hogging one CPU and managed to
crash the server after a day.

In short we now have a working file system but are a little concerned
about the leftovers from the changelog catalog.
We think that there are still loads of uncleared records that don't
really affect the system now, but could become an issue when we want to
use the changelog catalog again.
Is there any way to find out how many records are left?
Is it possible to remove these records manually?
We are running Lustre 2.5.3-RC1.

Kind regards,
Henk-Jan Zilverberg







