[lustre-discuss] Changelog users failing to clear records in 2.8, can anyone help
nanava at luis.uni-hannover.de
Mon Jan 7 10:39:51 PST 2019
Any advice on what could be done to keep the storage from filling up? Only a few GB are left on the MDT, so we plan to resize the storage device, though this does not seem to be a permanent solution.
> Has this problem been resolved? We are experiencing the same issue: we can't clear the changelog records, and as a result the MDT is gradually running out of space.
> Will updating the MDS to 2.10.x resolve the issue?
> Gizo Nanava
> Leibniz Universitaet IT Services
> Leibniz Universitaet Hannover
> Schlosswender Str. 5
> D-30159 Hannover
> Tel +49 511 762 7919085
> On Thursday, June 1, 2017 16:09 CEST, "Gibbins, Faye" <Faye.Gibbins at cirrus.com> wrote:
> > Hi,
> > We have 4 file systems on our Lustre cluster. All have changelog users registered for Robinhood to use.
> > We have discovered that a changelog user for one of the file systems is not catching up to its index. Manual runs of Robinhood fail to read any more records, even though according to mdd/tools-MDT0000/changelog_users there are records to read!
> > Over time the changelog filled up and the file system became sluggish. Wiping the Robinhood MySQL database and reinitializing Robinhood with a full scan didn't fix the issue and, as I said above, the three other changelogs from different file systems (on the same MGS) are OK when used from the same Robinhood instance.
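To quantify how far a changelog user has fallen behind, the contents of changelog_users can be compared against the current index. The following is a minimal sketch only. It assumes the file contains a "current index: N" line followed by "ID index" rows for users named cl1, cl2, and so on; that layout is not shown in this thread and should be checked against the actual output. The function name and the sample values are made up for illustration.

```python
def changelog_backlog(text):
    """Return {user_id: records_behind} from changelog_users-style text.

    Assumed layout (not confirmed by this thread):
        current index: <N>
        ID    index
        cl1   <last index cleared by cl1>
    """
    current = None
    user_index = {}
    for raw in text.splitlines():
        line = raw.strip()
        if "current index:" in line:
            # Take the first token after the marker as the head of the log.
            current = int(line.split("current index:")[1].split()[0])
            continue
        parts = line.split()
        # User rows look like "cl1 400"; skip the "ID index" header row.
        if len(parts) >= 2 and parts[0].startswith("cl") and parts[1].isdigit():
            user_index[parts[0]] = int(parts[1])
    if current is None:
        raise ValueError("no 'current index' line found")
    return {user: current - idx for user, idx in user_index.items()}


# Hypothetical sample values, not taken from the affected system.
sample = """mdd.tools-MDT0000.changelog_users=current index: 1000
ID    index
cl1   400
cl2   1000
"""

if __name__ == "__main__":
    for user, behind in changelog_backlog(sample).items():
        print(f"{user}: {behind} records behind")
```

A user whose backlog keeps growing between runs is the one preventing records from being purged.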
> > What makes me think this is a Lustre problem (we are using 2.8 on ext4) is this repeated error we are getting in syslog:
> > [Wed May 31 14:06:59 2017] Lustre: 46400:0:(llog.c:530:llog_process_thread()) invalid length -420090294 in llog record for index 372672342/61708
> > [Wed May 31 14:06:59 2017] LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22
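When many such messages accumulate, it can help to pull out the source location and return code mechanically. The sketch below parses the two lines quoted above using only the Python standard library. The regular expressions and the function name are illustrative assumptions based on these two samples, not a documented Lustre log format. The only interpretation added is the standard errno mapping: on Linux, rc=-22 corresponds to EINVAL ("Invalid argument"), which by itself does not identify the root cause.

```python
import errno
import re

# Matches the "(file.c:line:function())" part and captures the trailing message.
LOG_RE = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
)
RC_RE = re.compile(r"rc\s*=\s*(-?\d+)")


def parse_lustre_line(line):
    """Return a dict with file, line, func, msg and, if present, rc and errno_name."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    rc = RC_RE.search(info["msg"])
    if rc:
        code = int(rc.group(1))
        info["rc"] = code
        # Kernel return codes are negated errno values.
        info["errno_name"] = errno.errorcode.get(-code, "unknown")
    return info


# The two syslog lines quoted in this message.
LINES = [
    "[Wed May 31 14:06:59 2017] Lustre: 46400:0:(llog.c:530:llog_process_thread()) "
    "invalid length -420090294 in llog record for index 372672342/61708",
    "[Wed May 31 14:06:59 2017] LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) "
    "tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22",
]

if __name__ == "__main__":
    for entry in LINES:
        print(parse_lustre_line(entry))
```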
> > Deregistering the user from the changelog and registering a new one has not changed the behaviour, and we still can't use this new user to track changes to the file system.
> > Can anyone offer any advice on how to resolve this issue in the changelog?
> > If not, can anyone confirm whether taking the file system down for an e2fsck/lfsck will fix issues with the changelog? I'd settle for being able to clear the whole log and start afresh, if that's possible.
> > Yours
> > Faye Gibbins
> > Snr SysAdmin, Unix Lead Architect
> > Software Systems and Cloud Services
> > Cirrus Logic | cirrus.com<http://www.cirrus.com/> | +44 (0) 131 272 7398
Dr. Gizo Nanava
Leibniz Universitaet IT Services
Leibniz Universitaet Hannover
Schlosswender Str. 5
Tel +49 511 762 7919085