[lustre-discuss] no more free slots in catalog

quentin.bouget at cea.fr quentin.bouget at cea.fr
Tue Dec 11 06:47:36 PST 2018


On 11/12/2018 at 15:32, Julien Rey wrote:
> On 11/12/2018 at 14:13, quentin.bouget at cea.fr wrote:
>> On 11/12/2018 at 10:28, Julien Rey wrote:
>>> On 10/12/2018 at 13:33, quentin.bouget at cea.fr wrote:
>>>> On 10/12/2018 at 12:00, Julien Rey wrote:
>>>>> Hello,
>>>>>
>>>>> We are running lustre 
>>>>> 2.8.0-RC5--PRISTINE-2.6.32-573.12.1.el6_lustre.x86_64.
>>>>>
>>>>> Since Thursday we have been getting a "bad address" error when 
>>>>> trying to write to the Lustre volume.
>>>>>
>>>>> Looking at the logs on the MDS, we see messages like these:
>>>>>
>>>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) lustre-MDD0000: there 
>>>>> are no more free slots in catalog
>>>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) Skipped 45157 previous 
>>>>> similar messages
>>>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) lustre-MDD0000: 
>>>>> cannot store changelog record: type = 6, name = 
>>>>> 'PEPFOLD-00016_bestene1-mc-SC-min-grompp.log', t = 
>>>>> [0x20000a58f:0x858e:0x0], p = [0x20000a57d:0x17fd9:0x0]: rc = -28
>>>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) Skipped 45157 
>>>>> previous similar messages
>>>>>
>>>>>
>>>>> I saw here that this issue was supposed to be solved in 2.8.0:
>>>>> https://jira.whamcloud.com/browse/LU-6556
>>>>>
>>>>> Could someone help us resolve this situation?
>>>>>
>>>>> Thanks.
>>>>>
>>>> Hello,
>>>>
>>>> The log messages don't point at a "bad address" issue but rather at 
>>>> a "no space left on device" one ("rc = -28" --> -ENOSPC).
>>>>
>>>> You most likely have, at some point, registered a changelog user on 
>>>> your mds and that user is not consuming changelogs.
>>>>
>>>> You can check this by running:
>>>>
>>>> [mds0]# lctl get_param mdd.*.changelog_users
>>>> mdd.lustre-MDT0000.changelog_users=
>>>> current index: 3
>>>> ID    index
>>>> cl1   0
>>>>
>>>> The most important thing to look for is the distance between 
>>>> "current index" and the index for "cl1", "cl2", ...
>>>> I expect that, for at least one changelog user, that distance is 
>>>> 2^32 (the maximum number of changelog records).
>>>> Note that changelog indexes wrap around (0, 1, 2, ..., 4294967295, 
>>>> 0, 1, ...).
>>>>
>>>> If I am right, then you can either deregister the changelog user:
>>>>
>>>> [mds0]# lctl --device lustre-MDT0000 changelog_deregister cl1
>>>>
>>>> or acknowledge the records:
>>>>
>>>> [client]# lfs changelog_clear lustre-MDT0000 cl1 0
>>>>
>>>> (clearing with index 0 is a shortcut for "acknowledge every 
>>>> changelog record")
>>>>
>>>> Both of those options may take a while.
>>>>
>>>> There is a third one that might yield a faster result, but it is 
>>>> also much more dangerous to use (you might want to check with your 
>>>> support first):
>>>>
>>>> [mds0]# umount /dev/mdt0
>>>> [mds0]# mount -t ldiskfs /dev/mdt0 /mnt/lustre-mdt0
>>>> [mds0]# rm /mnt/lustre-mdt0/changelog_catalog
>>>> [mds0]# rm /mnt/lustre-mdt0/changelog_users
>>>> [mds0]# umount /dev/mdt0
>>>> [mds0]# mount -t lustre /dev/mdt0 <...> # remount the mdt where it was
>>>>
>>>> *I cannot guarantee this will not trash your filesystem. Use at your 
>>>> own risk.*
>>>>
>>>> ---
>>>>
>>>> In recent versions (2.12, maybe even 2.10), Lustre comes with a 
>>>> built-in garbage collector for slow/inactive changelog users.
>>>>
>>>> Regards,
>>>> Quentin Bouget
>>>>
>>>
>>> Hello Quentin,
>>>
>>> Many thanks for your quick reply.
>>>
>>> This is what I got when I issued the command you suggested:
>>>
>>> [root@lustre-mds]# lctl get_param mdd.*.changelog_users
>>> mdd.lustre-MDT0000.changelog_users=
>>> current index: 4160462682
>>> ID    index
>>> cl1   21020582
>>>
>>> I then issued the following command:
>>> [root@lustre-mds]# lctl --device lustre-MDT0000 changelog_deregister cl1
>>>
>>> It's been running for almost 20 hours now. Do you have an estimate 
>>> of how long it could take?
>> When you deregister a changelog user, every changelog record has to 
>> be invalidated (maybe this is batched, but I don't know enough about 
>> the on-disk structure to say).
>>
>> I do not recall ever waiting that long. Then again, I never 
>> personally deregistered a changelog user with that many pending 
>> changelog records.
>
> The changelog_deregister command still hasn't finished. Is there 
> any way to track the progress of the record purge?
I believe there is an "llog_reader" tool in the Lustre sources, 
but I have never really used it.

>> If you just want to make sure Lustre is doing something, you can have 
>> a look at your mdt0: invalidating changelog records should generate a 
>> high load of small random writes.
>> If the device is idle, something is probably wrong.
>
> Hard to tell. iostat doesn't show much I/O.
>
>> Is your filesystem still unavailable?
>
> The following command doesn't show any registered changelog user:
>
> cat /proc/fs/lustre/mdd/lustre-MDT0000/changelog_users
>
> I tried to mount the lustre volume on a client. I don't get the "Bad 
> Address" error anymore.
>
> Best,
Yes, I don't think you need to wait for "changelog_deregister" to 
complete before you start using your filesystem again.
If no changelog user is registered on your system, changelog records 
are not emitted.

Just try to wait until "changelog_deregister" completes before 
registering a new changelog user. =)
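
For reference, the backlog implied by the "changelog_users" output quoted 
earlier in this thread can be estimated with a short script. This is only 
a sketch under assumptions: the function names are hypothetical, the 
parser only handles the two-column output format shown in this thread, 
and the modulus assumes indexes wrap at 2^32 as described above.

```python
import re

# Assumption taken from this thread: changelog indexes wrap at 2^32.
INDEX_MODULUS = 2 ** 32


def parse_changelog_users(text):
    """Return (current_index, {user_id: index}) from changelog_users output.

    Hypothetical helper: it only understands the format quoted in this thread.
    """
    current = None
    users = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"current index:\s*(\d+)$", line)
        if match:
            current = int(match.group(1))
            continue
        match = re.match(r"(cl\d+)\s+(\d+)$", line)
        if match:
            users[match.group(1)] = int(match.group(2))
    if current is None:
        raise ValueError("no 'current index' line found")
    return current, users


def pending_records(current, user_index):
    """Distance from a user's index to the current index, with wrap-around."""
    return (current - user_index) % INDEX_MODULUS


# Output copied from the message quoted above.
SAMPLE = """\
mdd.lustre-MDT0000.changelog_users=
current index: 4160462682
ID    index
cl1   21020582
"""

if __name__ == "__main__":
    current, users = parse_changelog_users(SAMPLE)
    for user, index in sorted(users.items()):
        print(user, pending_records(current, index))
```

With the values Julien reported, this gives 4139442100 pending records 
for cl1, which is about 96% of 2^32.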

>
>>
>>>
>>> Best,
>>> -- 
>>> Julien REY
>>>
>>> Plate-forme RPBS
>>> Molécules Thérapeutiques In Silico (MTi)
>>> Université Paris Diderot - Paris VII
>>> tel : 01 57 27 83 95
>>>
>>>
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>> Quentin
>
>


Cheers,
Quentin
