[lustre-discuss] Robinhood exhausting RPC resources against 2.5.5 lustre file systems

Jeff Johnson jeff.johnson at aeoncomputing.com
Fri May 19 07:13:02 PDT 2017


You are getting a NID registering twice. Doug noticed and pointed it out.
I'd check whether that is one machine registering twice or two machines
configured with the same NID.
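
For the two-machines case, a minimal sketch of the comparison, assuming you have already collected the output of "lctl list_nids" from each client by whatever means you prefer. The host names and NIDs below are hypothetical, and the function name is only illustrative. It does not cover the other case, where a single machine registers twice.

```python
from collections import defaultdict


def find_duplicate_nids(nids_by_host):
    """Return {nid: [hosts]} for every NID reported by more than one host.

    nids_by_host maps a host name to the list of NID strings that host
    printed for "lctl list_nids" (one NID per line, e.g. "10.0.0.5@o2ib").
    """
    owners = defaultdict(set)
    for host, nids in nids_by_host.items():
        for nid in nids:
            nid = nid.strip()
            if nid:  # ignore blank lines in collected output
                owners[nid].add(host)
    # Keep only NIDs claimed by two or more distinct hosts.
    return {nid: sorted(hosts) for nid, hosts in owners.items() if len(hosts) > 1}


# Hypothetical inventory: two clients were configured with the same address.
example = {
    "client01": ["10.0.0.5@o2ib"],
    "client02": ["10.0.0.6@o2ib"],
    "client03": ["10.0.0.5@o2ib"],
}
duplicates = find_duplicate_nids(example)
for nid, hosts in duplicates.items():
    print(f"{nid} is claimed by: {', '.join(hosts)}")
```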


On Fri, May 19, 2017 at 05:58 Ms. Megan Larko <dobsonunit at gmail.com> wrote:

> Greetings Jessica,
> I'm not sure I am correctly understanding the behavior "robinhood activity
> floods the MDT".   The robinhood program as you (and I) are using it is
> consuming the MDT CHANGELOG via a reader_id which was assigned when the
> CHANGELOG was enabled on the MDT.   You can check the MDS for these readers
> via "lctl get_param mdd.*.changelog_users".  Each registered CHANGELOG
> reader must either have its records consumed by a process or be
> deregistered; otherwise the CHANGELOG will grow until it uses enough space
> to stop the MDT from functioning correctly.  So robinhood should consume
> and then clear the CHANGELOG via
> this reader_id.  This implementation of robinhood is actually a rather
> light-weight process as far as the MDS is concerned.   The load issues I
> encountered were on the robinhood server itself which is a separate server
> from the Lustre MGS/MDS server.
> Just curious, have you checked for multiple reader_ids on your MDS for
> this Lustre file system?
> P.S. My robinhood configuration file is using nb_threads = 8, just for a
> data point.
> Cheers,
> megan
> On Thu, May 18, 2017 at 2:36 PM, Jessica Otey <jotey at nrao.edu> wrote:
>> Hi Megan,
>> Thanks for your input. We use percona, a drop-in replacement for mysql...
>> The robinhood activity floods the MDT, but it does not seem to produce any
>> excessive load on the robinhood box...
>> Anyway, FWIW...
>> ~]# mysql --version
>> mysql  Ver 14.14 Distrib 5.5.54-38.6, for Linux (x86_64) using readline 5.1
>> Product:         robinhood
>> Version:         3.0-1
>> Build:           2017-03-13 10:29:26
>> Compilation switches:
>>     Lustre filesystems
>>     Lustre Version: 2.5
>>     Address entries by FID
>>     MDT Changelogs supported
>> Database binding: MySQL
>> RPM: robinhood-lustre-3.0-1.lustre2.5.el6.x86_64
>> Lustre rpms:
>> lustre-client-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64
>> lustre-client-modules-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64
>> On 5/18/17 11:55 AM, Ms. Megan Larko wrote:
>> With regard to the subject "Robinhood exhausting RPC resources against
>> 2.5.5 lustre file systems", which versions of robinhood and the MySQL
>> database are you running?   I mention this because I have been working with
>> robinhood-3.0-0.rc1 and initially MySQL-5.5.32 and Lustre on
>> kernel-2.6.32-573 and had issues in which robinhood demanded more CPU than
>> the 32 cores available on the robinhood server (with 128 GB RAM) and would
>> functionally hang that server.   The issue was solved for me by changing
>> to MySQL-5.6.35.   It was the "sort" command in robinhood that was not
>> working well with MySQL-5.5.32.
>> Cheers,
>> megan
Jeff Johnson
Aeon Computing

jeff.johnson at aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage