[Lustre-discuss] ldlm_locks memory usage crashes OSS

Jérémie Dubois-Lacoste jeremie.dl at gmail.com
Wed Sep 26 10:15:45 PDT 2012


So the max number is 800, and the maximum age is 36000000 (I don't know
the unit, but it looks like about an hour).
We have only two OSTs on each of our two OSSes.
With 80 clients, do you think there is any chance these settings are too high?
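For anyone checking the same thing: the Lustre manual documents lru_max_age in milliseconds, so 36000000 would be 10 hours, not one. A quick way to dump and adjust these values per namespace (sketch only; the `ldlm.namespaces.*` parameter paths are the standard ones, but confirm against your Lustre release):

```shell
# Show current cached-lock settings for every ldlm namespace on this node
lctl get_param ldlm.namespaces.*.lru_size ldlm.namespaces.*.lru_max_age

# Same data straight from /proc, one line per namespace
for ns in /proc/fs/lustre/ldlm/namespaces/*; do
    echo "$(basename "$ns"): lru_size=$(cat "$ns/lru_size") lru_max_age=$(cat "$ns/lru_max_age")"
done

# Example: drop the max lock age to 10 minutes (600000 ms) on all namespaces
# so idle locks are cancelled sooner and free their memory
lctl set_param ldlm.namespaces.*.lru_max_age=600000
```

Note that set_param changes are not persistent across a remount, so anything that helps would need to go into an init script or be set with `lctl conf_param`/`set_param -P` depending on your version.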


2012/9/26 Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu>:
>
> On Sep 26, 2012, at 11:34 AM, Jérémie Dubois-Lacoste wrote:
>
>> We have typically ~80 clients running.
>> In our case the locks were eating memory on the OSS, not the MDS,
>> but yes, it might still have the same cause.
>
> Hmmm.  With only 80 clients I wouldn't expect that locks would use up too much memory.  But the number of cached locks is controlled on a per target basis, so if the OSS node has quite a few OSTs, the aggregate number of locks could get large.
>
>> I'm not sure how to find the lifespan and number of locks per node. I
>> think we're using the default settings. Do you know how to check this?
>
> Take a look at /proc/fs/lustre/ldlm/namespaces/*/{lru_max_age,lru_size}. The lru_max_age file shows the amount of time a lock will be cached before it ages out.  The lru_size file shows the max number of locks that will be cached.
>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
