[lustre-discuss] LDLM locks not expiring/cancelling

Steve Crusan stevec at dug.com
Tue Jan 7 10:46:19 PST 2020


Thanks Diego, long time no see! I haven't been using NRS TBF.

I think there are a few problems, some of which we were aware of before, but
the lack of lock cancels was causing chaos.

* (Mark lustre_inode_cache as reclaimable)
https://jira.whamcloud.com/browse/LU-12313
* Tested on a 2.12.3 client (without the patch above), and we are actually
getting lock cancels now

So I think I'll join 2020 and run 2.12.3 and probably add the SUnreclaim
patch to that as well, as it seems simple enough.
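For anyone hitting the same thing, here is the quick check I use to see
whether lustre_inode_cache (or the ldlm slabs) is what is bloating
unreclaimable slab; standard procfs paths, but slab names may vary by
kernel and Lustre build:

    # How much slab memory the kernel considers unreclaimable
    grep SUnreclaim /proc/meminfo

    # Top slab consumers by cache size; look for lustre_inode_cache,
    # ldlm_locks, ldlm_resources, etc.
    slabtop -o -s c | head -20

    # Or straight from /proc/slabinfo
    grep -E 'lustre_inode_cache|ldlm' /proc/slabinfo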

Thank you!

~Steve

On Mon, Jan 6, 2020 at 2:33 AM Moreno Diego (ID SIS) <
diego.moreno at id.ethz.ch> wrote:

> Hi Steve,
>
>
>
> I was having a similar problem in the past months where the MDS servers
> would go OOM because of SUnreclaim growth. The root cause has not yet been
> found, but we stopped seeing it the day we disabled NRS TBF (QoS) for the
> LDLM services (just in case you have it enabled). It would also be good to
> check what's being consumed in the slab cache. In our case it was mostly
> kernel objects, not ldlm.
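>
> In case it helps, a rough sketch of what we changed (parameter paths are
> from memory, so treat the exact service names as an assumption; check
> what "lctl get_param ldlm.services.*" exposes on your version):
>
>     # Show which NRS policies are active on the LDLM services
>     lctl get_param ldlm.services.ldlm_canceld.nrs_policies
>     lctl get_param ldlm.services.ldlm_cbd.nrs_policies
>
>     # Fall back to the default FIFO policy instead of TBF
>     lctl set_param ldlm.services.ldlm_canceld.nrs_policies=fifo
>     lctl set_param ldlm.services.ldlm_cbd.nrs_policies=fifo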
>
>
>
> Diego
>
>
>
>
>
> *From: *lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on
> behalf of Steve Crusan <stevec at dug.com>
> *Date: *Thursday, 2 January 2020 at 20:25
> *To: *"lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> *Subject: *[lustre-discuss] LDLM locks not expiring/cancelling
>
>
>
> Hi all,
>
>
>
> We are running into a bizarre situation where stale locks are not being
> cancelled, and even worse, it seems as if ldlm.namespaces.*.lru_size is
> being ignored.
>
>
>
> For instance, I unmount our Lustre file systems on a client machine, then
> remount. Next, I run "lctl set_param ldlm.namespaces.*.lru_max_age=60s"
> and "lctl set_param ldlm.namespaces.*.lru_size=1024". This (I believe)
> should cap LDLM locks at 1024 per OSC, after which I'd expect to see a
> lot of lock cancels (via ldlm.namespaces.${ost}.pool.stats). We should
> also see cancels whenever a lock's age since grant exceeds lru_max_age.
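>
> Concretely, the sequence looks like this (the get_param at the end is
> just how I watch for cancels; the stats format may differ across
> versions):
>
>     lctl set_param ldlm.namespaces.*.lru_max_age=60s
>     lctl set_param ldlm.namespaces.*.lru_size=1024
>
>     # Watch the cancel counters per OST namespace
>     lctl get_param ldlm.namespaces.*OST*.pool.stats | grep -i cancel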
>
>
>
> We can trigger this simply by running 'find' on the root of our Lustre
> file system and waiting a while. Eventually the client's SUnreclaim
> value bloats to 60-70GB (!!!), and each of our OSTs has 30-40k LRU locks
> (via lock_count). This is early in the process:
>
>
>
> """
>
> ldlm.namespaces.h5-OST003f-osc-ffff8802d8559000.lock_count=2090
> ldlm.namespaces.h5-OST0040-osc-ffff8802d8559000.lock_count=2127
> ldlm.namespaces.h5-OST0047-osc-ffff8802d8559000.lock_count=52
> ldlm.namespaces.h5-OST0048-osc-ffff8802d8559000.lock_count=1962
> ldlm.namespaces.h5-OST0049-osc-ffff8802d8559000.lock_count=1247
> ldlm.namespaces.h5-OST004a-osc-ffff8802d8559000.lock_count=1642
> ldlm.namespaces.h5-OST004b-osc-ffff8802d8559000.lock_count=1340
> ldlm.namespaces.h5-OST004c-osc-ffff8802d8559000.lock_count=1208
> ldlm.namespaces.h5-OST004d-osc-ffff8802d8559000.lock_count=1422
> ldlm.namespaces.h5-OST004e-osc-ffff8802d8559000.lock_count=1244
> ldlm.namespaces.h5-OST004f-osc-ffff8802d8559000.lock_count=1117
> ldlm.namespaces.h5-OST0050-osc-ffff8802d8559000.lock_count=1165
>
> """
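>
> To put a single number on the growth, I sum the per-namespace counts
> (assuming the usual one-value-per-line output of "lctl get_param -n"):
>
>     lctl get_param -n ldlm.namespaces.*osc*.lock_count \
>         | awk '{ total += $1 } END { print total }'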
>
>
>
> But this will grow over time, and eventually the compute node gets
> evicted by the MDS (after 10 minutes of cancelling locks/hanging). The
> only way we have been able to reduce the slab usage is to drop caches and
> set lru_size=clear...but the problem just comes back depending on the
> workload.
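>
> For completeness, the workaround amounts to running the following on the
> affected client:
>
>     # Flush every LDLM namespace LRU on this client
>     lctl set_param ldlm.namespaces.*.lru_size=clear
>
>     # Then drop the kernel's page/dentry/inode caches
>     echo 3 > /proc/sys/vm/drop_caches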
>
>
>
> We are running 2.10.3 client side, 2.10.1 server side. Have there been any
> fixes added into the codebase for 2.10 that we need to apply? This seems to
> be the closest to what we are experiencing:
>
>
>
> https://jira.whamcloud.com/browse/LU-11518
>
>
>
>
>
> PS: I've checked other systems across our cluster, and some of them have
> as many as 50k locks per OST. I am kind of wondering if these locks are
> staying around much longer than the lru_max_age default (65 minutes), but
> I cannot prove that. Is there a good way to translate held locks to FIDs?
> I have been messing around with lctl set_param debug="XXX" and lctl
> set_param ldlm.namespaces.*.dump_namespace, but I don't feel like I'm
> getting *all* of the locks.
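>
> For reference, this is roughly what I've been running; dlmtrace and
> "lctl dk" are standard, but I'm not certain dump_namespace emits every
> lock (hence the question):
>
>     # Enable DLM tracing and enlarge the debug buffer
>     lctl set_param debug=+dlmtrace
>     lctl set_param debug_mb=512
>
>     # Ask each namespace to dump its resources/locks to the debug log
>     lctl set_param ldlm.namespaces.*.dump_namespace=1
>
>     # Read the debug log back; OST resource names are object ids that
>     # can be mapped back to FIDs
>     lctl dk /tmp/ldlm_dump.txt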
>
>
>
> ~Steve
>


-- 

*Steve Crusan*

Storage Specialist

DownUnder GeoSolutions



16200 Park Row Drive, Suite 100

Houston TX 77084, USA

tel +1 832 582 3221

stevec at dug.com

www.dug.com