[lustre-discuss] no more free slots in catalog

Tue Dec 11 05:13:58 PST 2018

On 11/12/2018 at 10:28, Julien Rey wrote:
> On 10/12/2018 at 13:33, quentin.bouget at cea.fr wrote:
>> On 10/12/2018 at 12:00, Julien Rey wrote:
>>> Hello,
>>>
>>> We are running lustre 
>>> 2.8.0-RC5--PRISTINE-2.6.32-573.12.1.el6_lustre.x86_64.
>>>
>>> Since Thursday we have been getting a "bad address" error when trying 
>>> to write to the Lustre volume.
>>>
>>> Looking at the logs on the MDS, we see messages like these:
>>>
>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) lustre-MDD0000: there are 
>>> no more free slots in catalog
>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) Skipped 45157 previous 
>>> similar messages
>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) lustre-MDD0000: 
>>> cannot store changelog record: type = 6, name = 
>>> 'PEPFOLD-00016_bestene1-mc-SC-min-grompp.log', t = 
>>> [0x20000a58f:0x858e:0x0], p = [0x20000a57d:0x17fd9:0x0]: rc = -28
>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) Skipped 45157 
>>> previous similar messages
>>>
>>>
>>> I saw here that this issue was supposed to be solved in 2.8.0:
>>> https://jira.whamcloud.com/browse/LU-6556
>>>
>>> Could someone help us resolve this situation?
>>>
>>> Thanks.
>>>
>> Hello,
>>
>> The log messages don't point at a "bad address" issue but rather at a 
>> "no space left on device" one ("rc = -28" --> -ENOSPC).
>>
>> You most likely have, at some point, registered a changelog user on 
>> your mds and that user is not consuming changelogs.
>>
>> You can check this by running:
>>
>> [mds0]# lctl get_param mdd.*.changelog_users
>> mdd.lustre-MDT0000.changelog_users=
>> current index: 3
>> ID    index
>> cl1   0
>>
>> The most important thing to look for is the distance between "current 
>> index" and the index for "cl1", "cl2", ...
>> I expect that, for at least one changelog user, that distance is 2^32 
>> (the maximum number of changelog records).
>> Note that changelog indexes wrap around (0, 1, 2, ..., 4294967295, 0, 
>> 1, ...).
>>
>> If I am right, then you can either deregister the changelog user:
>>
>> [mds0]# lctl --device lustre-MDT0000 changelog_deregister cl1
>>
>> or acknowledge the records:
>>
>> [client]# lfs changelog_clear lustre-MDT0000 cl1 0
>>
>> (clearing with index 0 is a shortcut for "acknowledge every changelog 
>> record")
>>
>> Both those options may take a while.
>>
>> There is a third option that might yield a faster result, but it is 
>> also much more dangerous to use (you might want to check with your 
>> support first):
>>
>> [mds0]# umount /dev/mdt0
>> [mds0]# mount -t ldiskfs /dev/mdt0 /mnt/lustre-mdt0
>> [mds0]# rm /mnt/lustre-mdt0/changelog_catalog
>> [mds0]# rm /mnt/lustre-mdt0/changelog_users
>> [mds0]# umount /dev/mdt0
>> [mds0]# mount -t lustre /dev/mdt0 <...> # remount the mdt where it was
>>
>> *I cannot guarantee this will not trash your filesystem. Use at your 
>> own risk.*
>>
>> ---
>>
>> In recent versions (2.12, maybe even 2.10), Lustre comes with a 
>> built-in garbage collector for slow/inactive changelog users.
>>
>> Regards,
>> Quentin Bouget
>>
>
> Hello Quentin,
>
> Many thanks for your quick reply.
>
> This is what I got when I issued the command you suggested:
>
> [root@lustre-mds]# lctl get_param mdd.*.changelog_users
> mdd.lustre-MDT0000.changelog_users=
> current index: 4160462682
> ID    index
> cl1   21020582
>
> I then issued the following command:
> [root@lustre-mds]# lctl --device lustre-MDT0000 changelog_deregister cl1
>
> It's been running for almost 20 hours now. Do you have an estimate 
> of how long it could take?
When you deregister a changelog user, every changelog record has to be 
invalidated (maybe this is batched, but I don't know enough about the 
on-disk structure to say).

I do not recall ever waiting that long. Then again, I never personally 
deregistered a changelog user with that many pending changelog records.
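
To put a number on "that many", here is a rough sketch of the arithmetic, 
assuming the 2^32 wrap-around described earlier in this thread. The 
function name pending_records is made up for illustration; it is not a 
Lustre command. It takes the "current index" and the user's index from 
the changelog_users output and prints the distance between them.

```shell
#!/bin/bash
# Hypothetical helper (not part of Lustre): number of changelog records
# a user has not acknowledged yet.
# Assumption: indexes wrap around at 2^32, as stated earlier in the thread.
pending_records() {
    local current=$1 user=$2
    local modulus=4294967296   # 2^32
    # Adding the modulus before the remainder keeps the result
    # non-negative when "current" has already wrapped past "user".
    echo $(( (current - user + modulus) % modulus ))
}

# Values from the changelog_users output quoted above:
# current index 4160462682, cl1 at 21020582
pending_records 4160462682 21020582
```

With the values above this prints 4139442100, which is roughly 96% of 
2^32 and consistent with the catalog having run out of free slots.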

If you just want to make sure Lustre is doing something, you can have a 
look at your mdt0: invalidating changelog records should generate a high 
load of small random writes.
If the device is idle, something is probably wrong.
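
One way to check for that write activity without extra tools is to 
sample /proc/diskstats twice and compare the "writes completed" counter 
(the 8th field of each line). This is only a sketch: the function names 
are made up, and "sdb" is a placeholder for whatever block device 
actually backs your MDT.

```shell
#!/bin/bash
# Print the cumulative number of completed writes for one block device.
# Field 3 of a /proc/diskstats line is the device name, field 8 is
# "writes completed". The optional second argument lets you point the
# function at a saved copy of the file instead of the live one.
writes_completed() {
    local dev=$1 stats=${2:-/proc/diskstats}
    awk -v dev="$dev" '$3 == dev { print $8 }' "$stats"
}

# Print how many writes completed on a device during an interval
# (in seconds, 10 by default).
write_delta() {
    local dev=$1 interval=${2:-10}
    local before after
    before=$(writes_completed "$dev")
    sleep "$interval"
    after=$(writes_completed "$dev")
    echo $(( after - before ))
}

# Example (placeholder device name, adjust to your MDT device):
#   write_delta sdb 10
```

A delta that stays at or near zero over several samples would suggest 
the device is idle.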

Is your filesystem still unavailable?

>
> Best,
> -- 
> Julien REY
>
> Plate-forme RPBS
> Molécules Thérapeutiques In Silico (MTi)
> Université Paris Diderot - Paris VII
> tel : 01 57 27 83 95
>
>
Quentin