<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>I think that may be a red herring related to rsyslog. When we
most recently rebooted the MDT, this is what was logged (taken from
the log still on the box, not from the log server):</p>
<p>May 3 14:24:22 asimov kernel: LNet: HW CPU cores: 12,
npartitions: 4<br>
May 3 14:24:30 asimov kernel: LNet: Added LNI 10.7.17.8@o2ib
[8/256/0/180]<br>
</p>
And lctl list_nids reports the NID only once:<br>
<br>
[root@asimov ~]# lctl list_nids<br>
10.7.17.8@o2ib<br>
<br>
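<p>To compare the local log with the copy on the log server, a
small script can count how often each "Added LNI" line appears. This
is only a sketch: the regular expression assumes the syslog layout
shown in the lines quoted above, and the log-server list is a
made-up example of a forwarded duplicate, not real data.</p>
<pre>
```python
import re
from collections import Counter

# Matches kernel log lines such as:
#   May 3 14:24:30 asimov kernel: LNet: Added LNI 10.7.17.8@o2ib [8/256/0/180]
ADDED_LNI = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+LNet: Added LNI (\S+)"
)


def count_lni_lines(lines):
    """Count 'Added LNI' lines keyed by (timestamp, host, nid)."""
    counts = Counter()
    for line in lines:
        m = ADDED_LNI.match(line.strip())
        if m:
            timestamp = " ".join(m.group(1).split())  # normalise date padding
            counts[(timestamp, m.group(2), m.group(3))] += 1
    return counts


def repeated(counts):
    """Return only the entries that were logged more than once."""
    return {key: n for key, n in counts.items() if n >= 2}


# Lines copied from the local log quoted above.
local_log = [
    "May 3 14:24:22 asimov kernel: LNet: HW CPU cores: 12, npartitions: 4",
    "May 3 14:24:30 asimov kernel: LNet: Added LNI 10.7.17.8@o2ib [8/256/0/180]",
]

# Hypothetical log-server copy in which the last line was forwarded twice.
server_log = local_log + [local_log[-1]]

print("local :", repeated(count_lni_lines(local_log)))
print("server:", repeated(count_lni_lines(server_log)))
```
</pre>
<p>If an entry repeats only in the log-server copy, with an identical
timestamp and host, that points at log forwarding rather than at the
NID actually being registered twice. Entries with different hosts but
the same NID would instead suggest two machines sharing a NID.</p>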
Jessica<br>
<br>
<div class="moz-cite-prefix">On 5/19/17 10:13 AM, Jeff Johnson
wrote:<br>
</div>
<blockquote
cite="mid:CAFCYAsdK0NvbzVVBNNzJh+WT=dbyiT18XbMVdLoVu7R0iBjTMg@mail.gmail.com"
type="cite">
<div>Jessica,</div>
<div><br>
</div>
<div>You are getting a NID registering twice. Doug noticed and
pointed it out. I'd look to see if that is one machine doing
something twice or two machines with the same NID.</div>
<div><br>
</div>
<div>--Jeff </div>
<div><br>
<div class="gmail_quote">
<div>On Fri, May 19, 2017 at 05:58 Ms. Megan Larko &lt;<a
moz-do-not-send="true" href="mailto:dobsonunit@gmail.com">dobsonunit@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div>Greetings Jessica,</div>
<div><br>
</div>
<div>I'm not sure I correctly understand the
behavior "robinhood activity floods the MDT". Robinhood,
as you (and I) are using it, consumes the MDT CHANGELOG
via a reader_id that was assigned when the CHANGELOG was
enabled on the MDT. You can check the MDS for these
readers via "lctl get_param mdd.*.changelog_users". The
records for each CHANGELOG reader must be consumed by a
process, or the reader must be destroyed; otherwise the
CHANGELOG will grow until it consumes enough space to stop
the MDT from functioning correctly. So robinhood should
consume and then clear the CHANGELOG via this reader_id.
Run this way, robinhood is actually a rather light-weight
process as far as the MDS is concerned. The load issues I
encountered were on the robinhood server itself, which is
a separate server from the Lustre MGS/MDS server.</div>
<div><br>
</div>
<div>Just curious, have you checked for multiple
reader_ids on your MDS for this Lustre file system?</div>
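<div>As a rough illustration of that check, the sketch below parses
the text printed by "lctl get_param mdd.*.changelog_users" and
reports how far each reader is behind the current index. The output
layout (a "current index:" line followed by "ID index" rows with ids
such as cl1) is an assumption about what the command prints, the
function names are hypothetical, and the sample values are made up;
adjust the parsing to what your MDS actually shows.</div>
<pre>
```python
import re


def parse_changelog_users(text):
    """Parse one MDT's changelog_users text into (current_index, readers).

    readers maps a reader id such as 'cl1' to the last index it cleared.
    The layout handled here is an assumption, not a documented contract.
    """
    current = None
    readers = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"current index:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"(cl\d+)\s+(\d+)", line)
        if m:
            readers[m.group(1)] = int(m.group(2))
    if current is None:
        raise ValueError("no 'current index:' line found")
    return current, readers


def reader_backlog(current, readers):
    """Return how many records each reader still has to consume."""
    return {rid: current - idx for rid, idx in readers.items()}


# Made-up sample: two registered readers, one of them far behind.
sample = """mdd.lustre-MDT0000.changelog_users=
current index: 1000
ID    index
cl1   990
cl2   12
"""

current, readers = parse_changelog_users(sample)
for rid, lag in sorted(reader_backlog(current, readers).items()):
    print(rid, "is", lag, "records behind")
if len(readers) >= 2:
    print("more than one reader registered:", sorted(readers))
```
</pre>
<div>A reader whose backlog keeps growing is one that nothing is
consuming; if it is not needed, it can be removed with lctl
changelog_deregister so that the CHANGELOG can be purged.</div>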
<div><br>
</div>
<div>P.S. My robinhood configuration file is using
nb_threads = 8, just for a data point.</div>
<div><br>
</div>
<div>Cheers,</div>
<div>megan</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, May 18, 2017 at 2:36 PM,
Jessica Otey <span>&lt;<a moz-do-not-send="true"
href="mailto:jotey@nrao.edu" target="_blank">jotey@nrao.edu</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hi Megan,<br>
</p>
<p>Thanks for your input. We use Percona, a drop-in
replacement for MySQL. The robinhood activity
floods the MDT, but it does not seem to produce
any excessive load on the robinhood box.</p>
<p>Anyway, FWIW...<br>
</p>
<p>~]# mysql --version<br>
mysql Ver 14.14 Distrib 5.5.54-38.6, for Linux
(x86_64) using readline 5.1<br>
</p>
<p>Product: robinhood<br>
Version: 3.0-1<br>
Build: 2017-03-13 10:29:26<br>
<br>
Compilation switches:<br>
Lustre filesystems<br>
Lustre Version: 2.5<br>
Address entries by FID<br>
MDT Changelogs supported<br>
<br>
Database binding: MySQL<br>
</p>
<p>RPM: robinhood-lustre-3.0-1.lustre2.5.el6.x86_64</p>
Lustre rpms:<br>
<p>lustre-client-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64<br>
lustre-client-modules-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64<br>
</p>
<br>
<div
class="m_8402669983453079809m_-8956838098082579714moz-cite-prefix">On
5/18/17 11:55 AM, Ms. Megan Larko wrote:<br>
</div>
<blockquote type="cite">
<div>
<div>With regard to the subject "Robinhood
exhausting RPC resources against 2.5.5
lustre file systems": what version of
robinhood and what version of MySQL are you
running? I mention this because I have been
working with robinhood-3.0-0.rc1, initially with
MySQL-5.5.32 and Lustre 2.5.42.1 on
kernel-2.6.32-573, and had issues in which
robinhood drove the load past what the 32 CPU
cores of the robinhood server (with 128 GB RAM)
could handle and would functionally hang
the robinhood server. The issue was solved
for me by changing to MySQL-5.6.35. It was
the "sort" command in robinhood that was not
working well with MySQL-5.5.32.</div>
<div><br>
</div>
<div>Cheers,</div>
<div>megan</div>
</div>
<div class="gmail_extra"><br>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
_______________________________________________<br>
lustre-discuss mailing list<br>
<a moz-do-not-send="true"
href="mailto:lustre-discuss@lists.lustre.org"
target="_blank">lustre-discuss@lists.lustre.org</a><br>
<a moz-do-not-send="true"
href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org"
rel="noreferrer" target="_blank">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
</blockquote>
</div>
</div>
<div dir="ltr">-- <br>
</div>
<div data-smartmail="gmail_signature">
<div dir="ltr">------------------------------<br>
Jeff Johnson<br>
Co-Founder<br>
Aeon Computing<br>
<br>
<a moz-do-not-send="true"
href="mailto:jeff.johnson@aeoncomputing.com" target="_blank">jeff.johnson@aeoncomputing.com</a><br>
<a moz-do-not-send="true" href="http://www.aeoncomputing.com"
target="_blank">www.aeoncomputing.com</a><br>
t: 858-412-3810 x1001 f: 858-412-3845<br>
m: 619-204-9061<br>
<br>
4170 Morena Boulevard, Suite D - San Diego, CA 92117
<div><br>
</div>
<div>High-Performance Computing / Lustre Filesystems /
Scale-out Storage</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>