[lustre-discuss] Robinhood exhausting RPC resources against 2.5.5 lustre file systems

Jeff Johnson jeff.johnson at aeoncomputing.com
Fri May 19 07:13:02 PDT 2017


Jessica,

You are getting a NID registering twice. Doug noticed and pointed it out.
I'd look to see if that is one machine doing something twice or two
machines with the same NID.

--Jeff

On Fri, May 19, 2017 at 05:58 Ms. Megan Larko <dobsonunit at gmail.com> wrote:

> Greetings Jessica,
>
> I'm not sure I correctly understand the behavior "robinhood activity
> floods the MDT". The robinhood program, as you (and I) are using it, is
> consuming the MDT CHANGELOG via a reader_id which was assigned when the
> CHANGELOG was enabled on the MDT. You can check the MDS for these readers
> via "lctl get_param mdd.*.changelog_users". Each registered CHANGELOG
> reader must either have its records consumed by a process or be
> deregistered; otherwise the CHANGELOG will grow until it consumes enough
> space to stop the MDT from functioning correctly. So robinhood should
> consume and then clear the CHANGELOG via this reader_id. This
> implementation of robinhood is actually a rather light-weight process as
> far as the MDS is concerned. The load issues I encountered were on the
> robinhood server itself, which is a separate server from the Lustre
> MGS/MDS server.
>
> Just curious, have you checked for multiple reader_ids on your MDS for
> this Lustre file system?
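
For anyone wanting to script that check, here is a minimal Python sketch. It assumes the output layout that the Lustre manual shows for "lctl get_param mdd.*.changelog_users" (a "current index" line followed by an ID/index table); the sample text and its numbers are made up, and the helper names parse_changelog_users and report are hypothetical. It lists the registered readers per MDT and how far each one lags behind the current index, so a stale or duplicate reader stands out.

```python
import re

# Made-up sample in the layout assumed for
# `lctl get_param mdd.*.changelog_users`; real output may differ by version.
SAMPLE = """\
mdd.lustre-MDT0000.changelog_users=
current index: 1500
ID    index
cl1   1498
cl2   12
"""


def parse_changelog_users(text):
    """Return {mdt_name: (current_index, {reader_id: last_cleared_index})}."""
    result = {}
    mdt = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"mdd\.(\S+)\.changelog_users=", line)
        if m:
            # Start of a new MDT section.
            mdt = m.group(1)
            result[mdt] = (0, {})
            continue
        if mdt is None:
            continue
        m = re.match(r"current index:\s*(\d+)", line)
        if m:
            result[mdt] = (int(m.group(1)), result[mdt][1])
            continue
        # Reader rows start with an ID such as cl1; extra columns are ignored.
        m = re.match(r"(cl\d+\S*)\s+(\d+)", line)
        if m:
            result[mdt][1][m.group(1)] = int(m.group(2))
    return result


def report(parsed):
    """Build human-readable lines: reader count warnings and per-reader lag."""
    lines = []
    for mdt, (current, readers) in parsed.items():
        if len(readers) > 1:
            lines.append(f"{mdt}: {len(readers)} readers registered")
        for reader, index in readers.items():
            lines.append(f"{mdt}: {reader} is {current - index} records behind")
    return lines


if __name__ == "__main__":
    for entry in report(parse_changelog_users(SAMPLE)):
        print(entry)
```

On a real MDS the text would come from running the lctl command above and passing its output to parse_changelog_users. A reader whose lag keeps growing is probably one that no process is consuming.
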
>
> P.S. My robinhood configuration file is using nb_threads = 8, just for a
> data point.
>
> Cheers,
> megan
>
> On Thu, May 18, 2017 at 2:36 PM, Jessica Otey <jotey at nrao.edu> wrote:
>
>> Hi Megan,
>>
>> Thanks for your input. We use Percona, a drop-in replacement for MySQL...
>> The robinhood activity floods the MDT, but it does not seem to produce any
>> excessive load on the robinhood box...
>>
>> Anyway, FWIW...
>>
>> ~]# mysql --version
>> mysql  Ver 14.14 Distrib 5.5.54-38.6, for Linux (x86_64) using readline 5.1
>>
>> Product:         robinhood
>> Version:         3.0-1
>> Build:           2017-03-13 10:29:26
>>
>> Compilation switches:
>>     Lustre filesystems
>>     Lustre Version: 2.5
>>     Address entries by FID
>>     MDT Changelogs supported
>>
>> Database binding: MySQL
>>
>> RPM: robinhood-lustre-3.0-1.lustre2.5.el6.x86_64
>> Lustre rpms:
>>
>> lustre-client-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64
>> lustre-client-modules-2.5.5-2.6.32_642.15.1.el6.x86_64_g22a210f.x86_64
>>
>> On 5/18/17 11:55 AM, Ms. Megan Larko wrote:
>>
>> With regard to the subject "Robinhood exhausting RPC resources against
>> 2.5.5 lustre file systems": what version of robinhood and what version of
>> MySQL database? I mention this because I have been working with
>> robinhood-3.0-0.rc1 and initially MySQL-5.5.32 and Lustre 2.5.42.1 on
>> kernel-2.6.32-573, and had issues in which robinhood demanded more than
>> all 32 CPU cores on the robinhood server (with 128 GB RAM) and would
>> functionally hang that server. The issue was solved for me by changing to
>> MySQL-5.6.35. It was the "sort" command in robinhood that was not working
>> well with MySQL-5.5.32.
>>
>> Cheers,
>> megan
>>
>>
>>
-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage