[lustre-discuss] no more free slots in catalog

Julien Rey julien.rey at univ-paris-diderot.fr
Mon Dec 17 01:40:56 PST 2018


On 11/12/2018 15:47, quentin.bouget at cea.fr wrote:
> On 11/12/2018 at 15:32, Julien Rey wrote:
>> On 11/12/2018 14:13, quentin.bouget at cea.fr wrote:
>>> On 11/12/2018 at 10:28, Julien Rey wrote:
>>>> On 10/12/2018 13:33, quentin.bouget at cea.fr wrote:
>>>>> On 10/12/2018 at 12:00, Julien Rey wrote:
>>>>>> Hello,
>>>>>>
>>>>>> We are running Lustre
>>>>>> 2.8.0-RC5--PRISTINE-2.6.32-573.12.1.el6_lustre.x86_64.
>>>>>>
>>>>>> Since Thursday we have been getting a "bad address" error when trying
>>>>>> to write to the Lustre volume.
>>>>>>
>>>>>> Looking at the logs on the MDS, we see messages like these:
>>>>>>
>>>>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>>>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) lustre-MDD0000: there 
>>>>>> are no more free slots in catalog
>>>>>> Dec 10 06:26:18 localhost kernel: Lustre: 
>>>>>> 9593:0:(llog_cat.c:93:llog_cat_new_log()) Skipped 45157 previous 
>>>>>> similar messages
>>>>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>>>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) lustre-MDD0000: 
>>>>>> cannot store changelog record: type = 6, name = 
>>>>>> 'PEPFOLD-00016_bestene1-mc-SC-min-grompp.log', t = 
>>>>>> [0x20000a58f:0x858e:0x0], p = [0x20000a57d:0x17fd9:0x0]: rc = -28
>>>>>> Dec 10 06:26:18 localhost kernel: LustreError: 
>>>>>> 9593:0:(mdd_dir.c:887:mdd_changelog_ns_store()) Skipped 45157 
>>>>>> previous similar messages
>>>>>>
>>>>>>
>>>>>> I saw here that this issue was supposed to be solved in 2.8.0:
>>>>>> https://jira.whamcloud.com/browse/LU-6556
>>>>>>
>>>>>> Could someone help us resolve this situation?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>> Hello,
>>>>>
>>>>> The log messages don't point at a "bad address" issue but rather 
>>>>> at a "no space left on device" one ("rc = -28" --> -ENOSPC).
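Editorial sketch, not part of the original exchange: one way to check what a
trailing "rc = N" in a Lustre log line means is to negate it and look it up in
the standard errno table. The helper name decode_rc and the regular expression
are illustrative assumptions; the numeric mapping is the one used on Linux.

```python
import errno
import os
import re

# Matches the trailing "rc = -28" seen at the end of the error messages above.
RC_PATTERN = re.compile(r"rc = (-?\d+)\s*$")


def decode_rc(log_line):
    """Return (rc, errno name, description) for a line ending in 'rc = N'.

    Returns None when the line carries no return code.
    """
    match = RC_PATTERN.search(log_line)
    if match is None:
        return None
    rc = int(match.group(1))
    code = abs(rc)  # kernel code reports errors as negated errno values
    name = errno.errorcode.get(code, "UNKNOWN")
    return rc, name, os.strerror(code)


line = "t = [0x20000a58f:0x858e:0x0], p = [0x20000a57d:0x17fd9:0x0]: rc = -28"
print(decode_rc(line))
```

On a Linux host the name reported for -28 is ENOSPC. The "bad address" text
seen on the client side is the description of a different code, EFAULT, which
is why the MDS log is the more informative place to look.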
>>>>>
>>>>> You most likely have, at some point, registered a changelog user
>>>>> on your MDS, and that user is not consuming changelogs.
>>>>>
>>>>> You can check this by running:
>>>>>
>>>>> [mds0]# lctl get_param mdd.*.changelog_users
>>>>> mdd.lustre-MDT0000.changelog_users=
>>>>> current index: 3
>>>>> ID    index
>>>>> cl1   0
>>>>>
>>>>> The most important thing to look for is the distance between 
>>>>> "current index" and the index for "cl1", "cl2", ...
>>>>> I expect that, for at least one changelog user, that distance is 2^32
>>>>> (the maximum number of changelog records).
>>>>> Note that changelog indexes wrap around (0, 1, 2, ..., 4294967295, 
>>>>> 0, 1, ...).
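Editorial sketch, not part of the original exchange: taking the wrap-around
described above at face value, the distance between "current index" and a
user's index can be computed modulo 2^32. The function and constant names are
illustrative assumptions, not Lustre identifiers.

```python
CHANGELOG_INDEX_SPACE = 2**32  # indexes run 0..4294967295, then wrap to 0


def pending_records(current_index, user_index):
    """Records between a user's index and the current index.

    Assumes both values wrap modulo 2**32, as described in the message above.
    """
    return (current_index - user_index) % CHANGELOG_INDEX_SPACE


# Values from the sample output above: current index 3, cl1 at 0.
print(pending_records(3, 0))
# Hypothetical case where the current index has already wrapped past zero.
print(pending_records(5, 4294967290))
```

The second call yields 11 rather than a negative number, which is the point of
the modulo. Note that under this arithmetic a distance of exactly 2^32 is
indistinguishable from 0, so a result of 0 on a failing system deserves a
second look.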
>>>>>
>>>>> If I am right, then you can either deregister the changelog user:
>>>>>
>>>>> [mds0]# lctl --device lustre-MDT0000 changelog_deregister cl1
>>>>>
>>>>> or acknowledge the records:
>>>>>
>>>>> [client]# lfs changelog_clear lustre-MDT0000 cl1 0
>>>>>
>>>>> (clearing with index 0 is a shortcut for "acknowledge every
>>>>> changelog record")
>>>>>
>>>>> Both those options may take a while.
>>>>>
>>>>> There is a third one that might yield a faster result, but it is
>>>>> also much more dangerous to use (you might want to check with your
>>>>> support first):
>>>>>
>>>>> [mds0]# umount /dev/mdt0
>>>>> [mds0]# mount -t ldiskfs /dev/mdt0 /mnt/lustre-mdt0
>>>>> [mds0]# rm /mnt/lustre-mdt0/changelog_catalog
>>>>> [mds0]# rm /mnt/lustre-mdt0/changelog_users
>>>>> [mds0]# umount /dev/mdt0
>>>>> [mds0]# mount -t lustre /dev/mdt0 <...> # remount the mdt where it was
>>>>>
>>>>> *I cannot guarantee this will not trash your filesystem. Use at
>>>>> your own risk.*
>>>>>
>>>>> ---
>>>>>
>>>>> In recent versions (2.12, maybe even 2.10), Lustre comes with a
>>>>> built-in garbage collector for slow/inactive changelog users.
>>>>>
>>>>> Regards,
>>>>> Quentin Bouget
>>>>>
>>>>
>>>> Hello Quentin,
>>>>
>>>> Many thanks for your quick reply.
>>>>
>>>> This is what I got when I issued the command you suggested:
>>>>
>>>> [root@lustre-mds]# lctl get_param mdd.*.changelog_users
>>>> mdd.lustre-MDT0000.changelog_users=
>>>> current index: 4160462682
>>>> ID    index
>>>> cl1   21020582
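Editorial sketch, not part of the original exchange: the output above can be
turned into a per-user backlog with a few lines of parsing. The function name
is an illustrative assumption, and the parser only handles text shaped like
the sample in this thread (a "current index:" line followed by rows whose ID
starts with "cl"); other Lustre versions may print additional columns.

```python
def parse_changelog_users(output):
    """Parse changelog_users text into (current_index, {user_id: index})."""
    current_index = None
    users = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("current index:"):
            current_index = int(line.split(":", 1)[1])
        elif line.startswith("cl"):
            # Keep only the first two columns: user ID and acknowledged index.
            user_id, index = line.split()[:2]
            users[user_id] = int(index)
    return current_index, users


sample = """\
mdd.lustre-MDT0000.changelog_users=
current index: 4160462682
ID    index
cl1   21020582
"""

current, users = parse_changelog_users(sample)
for user_id, index in users.items():
    backlog = (current - index) % 2**32  # assumes indexes wrap at 2**32
    print(f"{user_id}: {backlog} pending records ({backlog / 2**32:.1%} of 2^32)")
```

For the values reported here, cl1 is about 4.14 billion records behind, which
is roughly 96% of the 2^32 index space mentioned earlier in the thread.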
>>>>
>>>> I then issued the following command:
>>>> [root@lustre-mds]# lctl --device lustre-MDT0000 changelog_deregister cl1
>>>>
>>>> It's been running for almost 20 hours now. Do you have an
>>>> estimate of how long it could take?
>>> When you deregister a changelog user, every changelog record has to
>>> be invalidated (maybe this is batched, but I don't know enough about
>>> the on-disk structure to say).
>>>
>>> I do not recall ever waiting that long. Then again, I never
>>> personally deregistered a changelog user with that many pending
>>> changelog records.
>>
>> The changelog_deregister command still hasn't finished. Is there
>> any way to track the progress of the record purge?
> I believe there is an "llog_reader" implemented in the lustre sources, 
> but I never really used it.
>
>>> If you just want to make sure Lustre is doing something, you can 
>>> have a look at your mdt0: invalidating changelog records should 
>>> generate a high load of small random writes.
>>> If the device is idle, something is probably wrong.
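Editorial sketch, not part of the original exchange: on Linux, one way to judge
whether a device is seeing "a high load of small random writes" is to compare
two snapshots of /proc/diskstats and derive writes per second and the average
write size. The function names, the device name "sdb", and the counter values
below are made up for illustration; on a real MDS the snapshots would come
from reading /proc/diskstats twice, a few seconds apart.

```python
def parse_diskstats(text):
    """Map device name -> (writes completed, sectors written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        # Columns: major, minor, name, 4 read counters, then write counters.
        stats[fields[2]] = (int(fields[7]), int(fields[9]))
    return stats


def write_activity(before, after, device, interval_s):
    """Return (writes per second, average KiB per write) between snapshots."""
    writes = after[device][0] - before[device][0]
    sectors = after[device][1] - before[device][1]
    # /proc/diskstats counts sectors in units of 512 bytes.
    avg_kib = (sectors * 512 / 1024 / writes) if writes else 0.0
    return writes / interval_s, avg_kib


# Two made-up snapshots of a hypothetical device "sdb", 10 seconds apart.
t0 = parse_diskstats("8 16 sdb 1000 0 8000 500 20000 0 160000 900 0 700 1400")
t1 = parse_diskstats("8 16 sdb 1000 0 8000 500 45000 0 360000 1900 0 1200 2400")
print(write_activity(t0, t1, "sdb", 10.0))
```

A high write rate combined with a small average size would match the pattern
described above; a rate near zero would suggest the device is idle. This is
the same information iostat reports, just reduced to the two numbers of
interest.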
>>
>> Hard to tell. iostat doesn't show much I/O.
>>
>>> Is your filesystem still unavailable?
>>
>> The following command doesn't show any registered changelog user:
>>
>> cat /proc/fs/lustre/mdd/lustre-MDT0000/changelog_users
>>
>> I tried to mount the lustre volume on a client. I don't get the "Bad 
>> Address" error anymore.
>>
>> Best,
> Yes, I don't think you need to wait for "changelog_deregister" to 
> complete to start using your filesystem again.
> If no changelog user is registered on your system,
> changelog records are not emitted.
>
> Just try to wait until "changelog_deregister" completes before 
> registering a new changelog user. =)
>
>>
>>>
>>>>
>>>> Best,
>>>> -- 
>>>> Julien REY
>>>>
>>>> Plate-forme RPBS
>>>> Molécules Thérapeutiques In Silico (MTi)
>>>> Université Paris Diderot - Paris VII
>>>> tel : 01 57 27 83 95
>>>>
>>>>
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>> Quentin
>>
>>
>
>
> Cheers,
> Quentin
>

The changelog_deregister command eventually ended with an LBUG after a
few days, and the MDS rebooted automatically. But at least it's working
fine now. :)

Many thanks for your help.

Cheers,

-- 
Julien REY

Plate-forme RPBS
Molécules Thérapeutiques In Silico (MTi)
Université Paris Diderot - Paris VII
tel : 01 57 27 83 95
