<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none"><!--P{margin-top:0;margin-bottom:0;} p
{margin-top:0;
margin-bottom:0}
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman"}
a:link, span.MsoHyperlink
{color:#0563C1;
text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
{color:#954F72;
text-decoration:underline}
p.MsoQuote, li.MsoQuote, div.MsoQuote
{margin-top:10.0pt;
margin-right:.6in;
margin-bottom:8.0pt;
margin-left:.6in;
text-align:center;
font-size:12.0pt;
font-family:"Times New Roman";
color:#404040;
font-style:italic}
span.QuoteChar
{font-family:Calibri;
color:#404040;
font-style:italic}
span.EmailStyle21
{font-family:Calibri;
color:windowtext}
span.msoIns
{text-decoration:underline;
color:teal}
.MsoChpDefault
{font-size:10.0pt}
@page WordSection1
{margin:1.0in 1.0in 1.0in 1.0in}--></style>
</head>
<body dir="ltr" style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
<p>Thanks, Cory. We are still running 2.5.3.90, which doesn't have that fix. That patch looks like it would solve our slow-to-mount MDT. FWIW, I don't think we have many (any?) empty plain llogs, but the removal of the llog_process_or_fork() call in llog_cat_init_and_process()
looks like it addresses our issue - I see that call in the stacks of the osp-syn-* threads while the MDT is being read heavily during mounts.
</p>
<p><br>
</p>
<p>As a followup - is there any reason *not* to unmount the MDT, mount it as ldiskfs, and simply delete the plain llogs in our MDT's O/1/d* folders that contain only CHANGELOG_REC records? Or even every file under the MDT's O/1/d* folders? I'm a little unsure. It
seems that most (if not all) of the files there now are just taking up space, and nothing else is going to remove them.</p>
<p><br>
</p>
<p>FWIW, our intent is to start using changelogs and robinhood again after we upgrade to a later version of Lustre than the one we are currently running, at which time we'll just start over - register new changelog users and rescan the whole filesystem. We won't
care about any prior history.</p>
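<p>To make the narrower option concrete, here is a rough Python sketch of a dry run that only plans which files would be moved aside. It assumes a per-file survey of record types already exists (for example, gathered from llog_reader output), and it says nothing about whether removing these files is actually safe, which is the open question. The function name, the quarantine directory, and the mount path in the usage note are all hypothetical.</p>

```python
from pathlib import PurePosixPath

# Record type quoted in this thread for changelog records.
CHANGELOG_REC = "10660000"


def plan_moves(types_by_file, quarantine_dir):
    """Return (source, destination) pairs for files holding only CHANGELOG_REC.

    types_by_file maps a file path to the set of record types seen in it.
    Files with any other record type, or with no records at all, are skipped.
    Nothing is moved or deleted here; the result is only a plan to review.
    """
    plan = []
    for path in sorted(types_by_file):
        if types_by_file[path] == {CHANGELOG_REC}:
            src = PurePosixPath(path)
            # Keep the d* parent name so that files with the same name in
            # different d* directories cannot collide in the quarantine area.
            dest = PurePosixPath(quarantine_dir) / src.parent.name / src.name
            plan.append((str(src), str(dest)))
    return plan
```

<p>Calling plan_moves(survey, "/root/llog-quarantine") and printing each pair gives a list that can be reviewed before anything on the ldiskfs-mounted MDT is touched. Moving files aside rather than deleting them mirrors what we did earlier with the CONFIGS/changelog_{catalog,users} files.</p>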
<p><br>
</p>
<p>Thanks again,</p>
<p>Craig<br>
</p>
<p><br>
</p>
<div style="color:rgb(33,33,33)">
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Cory Spitz &lt;spitzcor@cray.com&gt;<br>
<b>Sent:</b> Monday, December 5, 2016 5:30 PM<br>
<b>To:</b> Prescott,Craig P; lustre-discuss@lists.lustre.org<br>
<b>Subject:</b> Re: [lustre-discuss] Changelog record cleanup in /O/1/d*</font>
<div> </div>
</div>
<div>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt; font-family:Calibri">Craig, FWIW, this sounds a lot like
<a href="https://jira.hpdd.intel.com/browse/LU-5038">https://jira.hpdd.intel.com/browse/LU-5038</a>, which was addressed in 2.7.0.</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt; font-family:Calibri">-Cory</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt; font-family:Calibri"> </span></p>
<p class="MsoNormal"><span style="font-size:11.0pt; font-family:Calibri"> </span></p>
<p class="MsoNormal"><span style="font-size:11.0pt; font-family:Calibri"> </span></p>
<div style="border:none; border-top:solid #B5C4DF 1.0pt; padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri; color:black">From: </span>
</b><span style="font-family:Calibri; color:black">lustre-discuss &lt;lustre-discuss-bounces@lists.lustre.org&gt; on behalf of "Prescott,Craig P" &lt;prescott@rc.ufl.edu&gt;<br>
<b>Date: </b>Monday, December 5, 2016 at 3:02 PM<br>
<b>To: </b>"lustre-discuss@lists.lustre.org" &lt;lustre-discuss@lists.lustre.org&gt;<br>
<b>Subject: </b>[lustre-discuss] Changelog record cleanup in /O/1/d*</span></p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<p> </p>
<p>We were running 2.5.3.90 with changelogs enabled earlier this summer. We ran into a catalog corruption issue (<span class="quote1">LU-6556) - we decided to deregister our changelog users, move the
</span>CONFIGS/changelog_{catalog,users} files out of the way, and carry on until we had an opportunity to upgrade. We did not remove anything from /O/1/d* at that time (though we probably should have).</p>
<p> </p>
<p>We've observed that mounting our MDT can take several-to-many minutes - I can see with iostat that the MDT is very busy with reads while it is being mounted. I suspect that those stale files in /O/1/d* are the reason (there are lots of them), as they are
processed by the OSP sync at MDT startup. I looked with debugfs at the /O/1/d* directories - there are 1000s of files and their timestamps are consistent with when we were using changelogs. I dumped a few randomly selected ones and checked with llog_reader
that the types of records they contain are CHANGELOG_REC (type=10660000). </p>
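<p>The spot check described above could be scripted along these lines. This is only a sketch: it assumes llog_reader prints a "type=HEX" token for each record (as in the type=10660000 seen above), that llog_reader is on the PATH, and that the llog files are reachable as ordinary files (for example, copied out with debugfs or read from an ldiskfs mount). The function names are hypothetical.</p>

```python
import re
import subprocess
from collections import Counter

# Record type reported for the changelog records in this thread.
CHANGELOG_REC = "10660000"

# Assumption: each record line printed by llog_reader carries a "type=HEX" token.
TYPE_TOKEN = re.compile(r"\btype=([0-9a-fA-F]+)")


def record_types(reader_output):
    """Count the record types found in the text of one llog_reader dump."""
    return Counter(token.lower() for token in TYPE_TOKEN.findall(reader_output))


def dump_llog(path):
    """Run llog_reader on one file and return its text output."""
    result = subprocess.run(
        ["llog_reader", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def is_changelog_only(types):
    """True when the dump has records and every one of them is CHANGELOG_REC."""
    return bool(types) and set(types) == {CHANGELOG_REC}
```

<p>Running is_changelog_only(record_types(dump_llog(path))) over every file, instead of a random sample, would show exactly which files are purely changelog records and which are the handful of other llogs.</p>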
<p> </p>
<p>At the least, I think we should remove the files in /O/1/d* that contain CHANGELOG_REC entries. Can I just delete every file in /O/1/d*, or do I need to be careful and only remove the files containing CHANGELOG_REC entries?
</p>
<p> </p>
<p>The reason I ask is that I do see a handful of files that are not changelog-related in these directories - their timestamps are newer and their record type as reported by llog_reader is not CHANGELOG_REC or CHANGELOG_USER. There are only a small number
of such files, though.</p>
<p> </p>
<p>Thanks,</p>
<p>Craig Prescott</p>
<p class="MsoNormal">University of Florida Research Computing </p>
</div>
</div>
</div>
</body>
</html>