[lustre-discuss] Changelog users failing to clear records in 2.8, can anyone help

Colin Faber cfaber at gmail.com
Thu Jan 24 22:16:02 PST 2019


Have you tried manually purging the changelog files and catalog then
restarting by re-registering? Also, are you sure that _all_ consumers are
requesting to clear the records?
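
To check the second point, here is a rough sketch (not a tested tool) that
compares each registered user's index with the current index. Records are
only purged once every registered user has cleared them, so one stalled
consumer is enough to let the changelog grow. It assumes the text comes from
something like "lctl get_param -n mdd.<fsname>-MDT0000.changelog_users" and
that it has the usual "current index: N" line followed by "ID index" rows.
The sample values and the function names are made up for illustration.

```python
import re


def parse_changelog_users(text):
    """Parse changelog_users text into (current_index, {user_id: index}).

    Assumes a 'current index: N' line followed by rows of '<id> <index>'.
    """
    current = None
    users = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"current index:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        # User IDs look like cl1, cl2, ...; any trailing columns are ignored.
        m = re.match(r"(cl\S+)\s+(\d+)", line)
        if m:
            users[m.group(1)] = int(m.group(2))
    if current is None:
        raise ValueError("no 'current index' line found")
    return current, users


def lagging_users(text, max_lag=0):
    """Return {user_id: lag} for users more than max_lag records behind."""
    current, users = parse_changelog_users(text)
    return {uid: current - idx
            for uid, idx in users.items()
            if current - idx > max_lag}


# Hypothetical sample, not taken from a real system.
SAMPLE = """\
current index: 1000
ID    index
cl1   1000
cl2   250
"""

if __name__ == "__main__":
    for uid, lag in sorted(lagging_users(SAMPLE).items()):
        print(f"{uid} is {lag} records behind")
```

Any user that shows up here is holding records back. Either it needs to
clear them (lfs changelog_clear) or, if it is no longer used, it should be
deregistered.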

On Mon, Jan 7, 2019 at 11:40 AM nanava at luis.uni-hannover.de <
nanava at luis.uni-hannover.de> wrote:

>
> Any advice on what could be done here to avoid the storage overflowing? As
> only a few GB are left on the MDT, we plan to resize the storage device,
> though this does not seem to be a permanent solution.
>
> Thank you.
>
> >
> > Has this problem been resolved? We are experiencing the same issue: we
> > can't clear the changelog records, and as a result the MDT is gradually
> > running out of space.
> > Will updating to 2.10.x on the MDS resolve the issue?
> >
> > Thanks.
> >
> > Regards
> > -----
> > Gizo Nanava
> > Leibniz Universitaet IT Services
> > Leibniz Universitaet Hannover
> > Schlosswender Str. 5
> > D-30159 Hannover
> > Tel +49 511 762 7919085
> > http://www.luis.uni-hannover.de
> >
> > On Thursday, June 1, 2017 16:09 CEST, "Gibbins, Faye" <
> Faye.Gibbins at cirrus.com> wrote:
> >
> > > Hi,
> > >
> > > We have 4 file systems on our Lustre cluster. All have changelog users
> > > registered for Robinhood to use.
> > >
> > > We have discovered that a changelog user for one of the file systems
> > > is not catching up to its index. Manual runs of Robinhood fail to read
> > > any more records, even though according to
> > > mdd/tools-MDT0000/changelog_users there are records to read!
> > >
> > > Over time the changelog filled up and the file system became sluggish.
> > > Wiping the Robinhood MySQL database and reinitializing Robinhood with a
> > > full scan didn't fix the issue and, as I said above, the three other
> > > changelogs from different file systems (on the same MGS) are OK when
> > > used from the same Robinhood instance.
> > >
> > > What makes me think this is a Lustre problem (we are using 2.8 on
> > > ext4) is this repeated error we are getting in syslog:
> > >
> > > [Wed May 31 14:06:59 2017] Lustre: 46400:0:(llog.c:530:llog_process_thread()) invalid length -420090294 in llog record for index 372672342/61708
> > > [Wed May 31 14:06:59 2017] LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22
> > >
> > > Deregistering the user from the changelog and starting with a new one
> > > has not changed the behaviour, and we still can't use the new user to
> > > track changes to the file system.
> > >
> > > Can anyone offer any advice on how to resolve this issue in the
> > > changelog?
> > > If not, can anyone confirm whether taking the file system down for an
> > > e2fsck/lfsck will fix issues with the changelog? I'd settle for being
> > > able to clear the whole log and start afresh, if that's possible.
> > >
> > > Yours
> > > Faye Gibbins
> > > Snr SysAdmin, Unix Lead Architect
> > > Software Systems and Cloud Services
> > > Cirrus Logic | cirrus.com<http://www.cirrus.com/>  | +44 (0) 131 272
> 7398
> > >
> >
> >
> >
> > _______________________________________________
> > lustre-discuss mailing list
> > lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
>
> --
> _______________________
> Dr. Gizo Nanava
> Leibniz Universitaet IT Services
> Leibniz Universitaet Hannover
> Schlosswender Str. 5
> D-30159 Hannover
> Tel +49 511 762 7919085
> http://www.luis.uni-hannover.de
>
>
>