[lustre-discuss] Changelog users failing to clear records in 2.8, can anyone help

Arman Khalatyan arm2arm at gmail.com
Fri Jan 25 11:55:57 PST 2019


I am not sure whether you hit the same bug as we did: in our case the llog
failed to clear several times and filled the whole MDT, but upgrading from
2.8.x to 2.9 resolved the log-clearing problem.
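
One way to see whether a consumer is falling behind is to compare each
registered user's index with the current index reported in
mdd/tools-MDT0000/changelog_users. Below is a minimal Python sketch, assuming
the output contains a "current index: N" line followed by one "clN  index" row
per user. The function name and the sample values are hypothetical and not
taken from this thread; the exact layout may differ between Lustre versions.

```python
import re


def changelog_lag(text):
    """Parse changelog_users-style text.

    Returns (current_index, {user_id: records_behind}).
    Assumes a 'current index: N' line and rows like 'cl1   1200'.
    """
    current = None
    users = {}
    for line in text.splitlines():
        m = re.search(r"current index:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        # User rows start with an ID such as cl1, cl2, ...
        m = re.match(r"\s*(cl\S+)\s+(\d+)", line)
        if m:
            users[m.group(1)] = int(m.group(2))
    if current is None:
        raise ValueError("no 'current index' line found")
    return current, {uid: current - idx for uid, idx in users.items()}


# Illustrative sample only (hypothetical values).
SAMPLE = """current index: 1200
ID    index
cl1   1200
cl2   645
"""

current, lag = changelog_lag(SAMPLE)
for uid, behind in sorted(lag.items()):
    print(uid, "is", behind, "records behind index", current)
```

On a real MDS the text could be taken from the output of
"lctl get_param mdd.*.changelog_users". A user whose lag keeps growing is
not having its records cleared.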



On Fri, Jan 25, 2019 at 07:16, Colin Faber <cfaber at gmail.com> wrote:

> Have you tried manually purging the changelog files and catalog then
> restarting by re-registering? Also, are you sure that _all_ consumers are
> requesting to clear the records?
>
> On Mon, Jan 7, 2019 at 11:40 AM nanava at luis.uni-hannover.de <
> nanava at luis.uni-hannover.de> wrote:
>
>>
>> Any advice on what could be done to avoid storage overflow? As only a
>> few GB are left on the MDT, we plan to resize the storage device, though
>> this does not seem to be a permanent solution.
>>
>> Thank you.
>>
>> >
>> > Has this problem been resolved? We are experiencing the same issue: we
>> > cannot clear the changelog records, and as a result the MDT is gradually
>> > running out of space.
>> > Will updating the MDS to 2.10.x resolve the issue?
>> >
>> > Thanks.
>> >
>> > Regards
>> > -----
>> > Gizo Nanava
>> > Leibniz Universitaet IT Services
>> > Leibniz Universitaet Hannover
>> > Schlosswender Str. 5
>> > D-30159 Hannover
>> > Tel +49 511 762 7919085
>> > http://www.luis.uni-hannover.de
>> >
>> > On Thursday, June 1, 2017 16:09 CEST, "Gibbins, Faye" <
>> Faye.Gibbins at cirrus.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > We have 4 file systems on our lustre cluster. All have changelog
>> users registered for robinhood to use.
>> > >
>> > > We have discovered that a changelog user for one of the file systems
>> > > is not catching up to its index. Manual runs of Robinhood fail to read
>> > > any more records even though, according to
>> > > mdd/tools-MDT0000/changelog_users, there are records to read!
>> > >
>> > > Over time the changelog filled up and the file system became
>> > > sluggish. Wiping the Robinhood MySQL database and reinitializing
>> > > Robinhood with a full scan didn't fix the issue and, as I said above,
>> > > the three other changelogs from different file systems (on the same
>> > > MSG) are OK when used from the same Robinhood instance.
>> > >
>> > > What makes me think this is a lustre (and we are using 2.8 on ext4)
>> problem is this (repeated) error we are getting in syslog:
>> > >
>> > > [Wed May 31 14:06:59 2017] Lustre:
>> 46400:0:(llog.c:530:llog_process_thread()) invalid length -420090294 in
>> llog record for index 372672342/61708
>> > > [Wed May 31 14:06:59 2017] LustreError:
>> 46400:0:(mdd_device.c:261:llog_changelog_cancel()) tools-MDD0000: cancel
>> idx 645 of catalog 0x7:10 rc=-22
>> > >
>> > > Deregistering the user from the change log and starting with a new
>> one has not changed the behaviour and we still can't use this new user to
>> track changes to the file system.
>> > >
>> > > Can anyone offer any advice on how to resolve this issue in the
>> changelog?
>> > > If not, can anyone confirm whether taking the file system down for an
>> > > e2fsck/lfsck will fix issues with the changelog? I'd settle for being
>> > > able to clear the whole log and start afresh, if that's possible.
>> > >
>> > > Yours
>> > > Faye Gibbins
>> > > Snr SysAdmin, Unix Lead Architect
>> > > Software Systems and Cloud Services
>> > > Cirrus Logic | cirrus.com<http://www.cirrus.com/>  | +44 (0) 131 272
>> 7398
>> > >
>> >
>> >
>> >
>> > _______________________________________________
>> > lustre-discuss mailing list
>> > lustre-discuss at lists.lustre.org
>> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>>
>> --
>> _______________________
>> Dr. Gizo Nanava
>> Leibniz Universitaet IT Services
>> Leibniz Universitaet Hannover
>> Schlosswender Str. 5
>> D-30159 Hannover
>> Tel +49 511 762 7919085
>> http://www.luis.uni-hannover.de
>>
>>
>>
>