[Lustre-discuss] ldlm_locks memory usage crashes OSS

Wed Sep 26 07:15:03 PDT 2012

How many clients are using your file system?  We had an issue at one point with our MDS running out of memory due to large numbers of locks. I did some digging and found that each client was set to cache 1200 locks (100 per core), and the lifespan for the cached locks was very long (although I can't remember the exact value).  A quick calculation showed that based on our machine size, the MDS did not have enough memory to support this many locks.  These settings had been in use for years, but we never had a problem until a user ran a very large job which opened/closed thousands of files per node.  The number of cached locks would slowly build until the MDS OOM'ed.  We ended up reducing the number of cached locks per client and this resolved the issue.
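
The back-of-the-envelope calculation can be sketched as below. This is only a rough illustration, not the exact numbers from our system: the client count, the server memory budget, and especially the per-lock memory cost are hypothetical placeholders. The real per-lock cost depends on the Lustre version and should be read from slabtop on the server. (On Lustre clients the per-client lock cache limit is usually the ldlm namespace `lru_size` setting; that mapping is an assumption here, not something checked against our old configuration.)

```python
# Rough estimate of server memory consumed by cached client locks.
# All numeric inputs in the example below are hypothetical placeholders.

def lock_memory_bytes(num_clients, locks_per_client, bytes_per_lock):
    """Worst case: every client fills its lock cache completely."""
    return num_clients * locks_per_client * bytes_per_lock


def max_locks_per_client(memory_budget_bytes, num_clients, bytes_per_lock):
    """Largest per-client lock cache that still fits in the memory budget."""
    return memory_budget_bytes // (num_clients * bytes_per_lock)


if __name__ == "__main__":
    clients = 1000              # hypothetical number of client nodes
    cached = 1200               # locks cached per client (100 per core, 12 cores)
    per_lock = 1024             # ASSUMED bytes of server memory per lock
    budget = 4 * 1024**3        # hypothetical 4 GiB the server can spare for locks

    used = lock_memory_bytes(clients, cached, per_lock)
    print("worst-case lock memory: %.2f GiB" % (used / 1024**3))
    print("per-client limit that fits:", max_locks_per_client(budget, clients, per_lock))
```

If the worst-case figure is close to or above the server's total RAM, lowering the per-client limit (or the lock lifespan) is the obvious knob to try.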

I don't know if this could be the same type of problem affecting your system, but I thought I would share the details in case it was relevant.

-- 
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

On Sep 26, 2012, at 6:14 AM, Jérémie Dubois-Lacoste wrote:

> So I still have no idea what the cause of this was, but we shut down the
> entire cluster, rebooted the head node (with the MDS) twice, and things
> were working fine again. Good old method!
> Maybe something was wrong with one or several clients somehow stuck
> on a lock. If it happens again I'll post it.
> Thanks anyway!
> 
> Jérémie
> 
> 
> 2012/9/25 Jérémie Dubois-Lacoste <jeremie.dl at gmail.com>:
>> Hi All,
>> 
>> We have a problem with one of our OSSes crashing out of memory on a
>> system that we recently reinstalled.  Our system uses two OSSes with two
>> OSTs on each, running Lustre 2.1.3 on CentOS 6.3 with the kernel
>> 2.6.32-220.17.1.el6_lustre.x86_64 (so, 64-bit).
>> 
>> One of the OSSes runs low on memory until it provokes a kernel
>> panic. Checking with 'slabtop', it comes from the memory usage of
>> "ldlm_locks", which keeps growing forever (until it crashes).  The
>> growth rate is rather quick: close to 1 MB per second, so in ~1h it
>> takes it all.
>> 
>> It may be related to the following bug:
>> https://bugzilla.lustre.org/show_bug.cgi?id=19950
>> However, this was for Lustre 1.6, so I'm not sure.
>> 
>> I tried rebooting and resynchronizing with the MDS afterwards, but the
>> same happens again.  Now that I check the other OSS (the one that is ok)
>> carefully, the same seems to happen but at a much slower growth
>> rate. Not sure yet.
>> 
>> This may be a consequence, or otherwise related:
>> We are using Lustre on a computing cluster with Sun Grid Engine 6.2u5,
>> and any job we submit takes a *HUGE* amount of memory compared to what
>> it needed before our upgrade (and what it takes if we run it
>> directly, not through SGE). If the measurements we get from SGE are
>> correct, the difference can be up to x1000: many jobs then get killed.
>> Sorry if this is not the proper place to post this, but I have the
>> intuition that this could be related, and some people here may be
>> familiar with the Lustre+SGE combination.
>> 
>> Any suggestions welcome!
>> 
>> Thanks,
>> 
>>    Jérémie
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>