[lustre-discuss] Lustre ldlm_lockd errors

Riccardo Veraldi Riccardo.Veraldi at cnaf.infn.it
Fri May 4 14:05:14 PDT 2018


I did not see any comments on this; it is quite a serious problem for us.
I went back to our setup and wiped everything away to install and
configure a classic standard setup, stock Lustre kernel with Lustre
2.10.3/ldiskfs, on servers and on clients with RHEL 7.4.
The problem still occurs.
It is easy to replicate by writing a big file and then calling stat() on it,
asking for the file size from the Lustre clients.
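
Roughly, the reproducer looks like the sketch below (a minimal sketch only:
the path, chunk size, target size and polling interval are placeholders,
not our actual DAQ code). One client streams a big file while one or more
other clients keep asking for its size with os.stat().

#!/usr/bin/env python
# Minimal reproducer sketch: a writer streams a large file to Lustre
# while readers on other clients poll its size with os.stat().
# PATH, CHUNK and TARGET are placeholder values.
import os
import sys
import time

PATH = "/drplu/bigfile"
CHUNK = b"\0" * (4 * 1024 * 1024)   # 4 MiB per write
TARGET = 60 * 1024 ** 3             # stop after ~60 GiB

def writer():
    written = 0
    with open(PATH, "wb") as f:
        while written < TARGET:
            f.write(CHUNK)
            written += len(CHUNK)

def reader():
    # Keep asking for the file size; on our setup this stat() loop
    # is where the client evictions show up.
    while True:
        try:
            size = os.stat(PATH).st_size
        except OSError:
            size = -1   # file not there yet
        if size >= TARGET:
            break
        time.sleep(0.1)

if __name__ == "__main__":
    writer() if sys.argv[1:] == ["write"] else reader()

Run it with the "write" argument on one client and with no argument on the
other clients, all pointing at the same Lustre file.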

client side:

May  3 12:14:43 drp-tst-acc04 kernel: LustreError: 167-0:
drplu-OST0001-osc-ffff88203909c800: This client was evicted by
drplu-OST0001; in progress operations using this service will fail.


server side:

May  3 12:14:43 drp-tst-oss10 kernel: LustreError:
22641:0:(ldlm_lockd.c:2365:ldlm_cancel_handler()) ldlm_cancel from
172.21.52.129 at o2ib arrived at 1525374883 with bad export cookie
11386024826466070212
May  3 12:14:43 drp-tst-oss10 kernel: LustreError:
22260:0:(ldlm_lockd.c:334:waiting_locks_callback()) ### lock callback
timer expired after 101s: evicting client at 172.21.52.129 at o2ib  ns:
filter-drplu-OST0001_UUID lock: ffff8803b1a64000/0x9e0349410284284e lrc:
3/0,0 mode: PR/PR res: [0x80:0x0:0x0].0x0 rrc: 3 type: EXT
[0->62927998975] (req 62720000000->62736


It seems like an InfiniBand network problem. Yet it does not occur if I
simply write or read files: even when saturating InfiniBand EDR at 6 GB/s
the error does not show up.

Any hints on how I could better understand this issue?
Could it be that the model of SSD disk I have cannot keep up with writing
and reading to/from the same file at the same time?
The timeouts are really huge.

thanks

Rick




On 4/25/18 2:49 PM, Riccardo Veraldi wrote:
> Hello,
> I am having a quite serious problem with the lock manager.
> First of all, we are using Lustre 2.10.3 on both the server and client side
> on RHEL7.
> The only difference between servers and clients is that the Lustre OSSes have
> kernel 4.4.126 while the clients have the stock RHEL7 kernel.
> We have NVMe disks on the OSSes, and kernel 4.4 manages IRQ balancing for
> NVMe disks much better.
> It is possible to reproduce the problem.
> I get this error during simultaneous reads and writes. If I run the
> writer and reader sequentially, the problem does not occur and
> everything performs really well.
> Unfortunately we need to write a file and have several threads reading
> from it too.
> So one big file is written, and after a while multiple reader threads
> access the file to read data (experimental data). This is the model of
> our DAQ.
> The specific failures occur in the reader threads when they ask
> for the file size (the call to os.stat() in Python).
> This is done both to delay the start of the readers until the file exists
> and to keep the readers from deadlocking the writer by repeatedly asking
> for the data at the end of the file.
> I do not know if there is a way to fix this. Apparently, writing one file
> and having a bunch of threads reading from the same file makes the lock
> manager unhappy in some way.
> Any hints would be greatly appreciated. Thank you.
>
> Errors OSS side:
>
> Apr 25 10:31:19 drp-tst-ffb01 kernel: LustreError:
> 0:0:(ldlm_lockd.c:334:waiting_locks_callback()) ### lock callback timer
> expired after 101s: evicting client at 172.21.52.131 at o2ib  ns:
> filter-drpffb-OST0001_UUID lock: ffff88202010b600/0x5be7c3e66a45b63f
> lrc: 3/0,0 mode: PR/PR res: [0x4ad:0x0:0x0].0x0 rrc: 4397 type: EXT
> [0->18446744073709551615] (req 0->18446744073709551615) flags:
> 0x60000400010020 nid: 172.21.52.131 at o2ib remote: 0xc0c93433d781fff9
> expref: 5 pid: 10804 timeout: 4774735450 lvb_type: 1
> Apr 25 10:31:20 drp-tst-ffb01 kernel: LustreError:
> 9524:0:(ldlm_lockd.c:2365:ldlm_cancel_handler()) ldlm_cancel from
> 172.21.52.127 at o2ib arrived at 1524677480 with bad export cookie
> 6622477171464070609
> Apr 25 10:31:20 drp-tst-ffb01 kernel: Lustre: drpffb-OST0001: Connection
> restored to 23bffb9d-10bd-0603-76f6-e2173f99e3c6 (at 172.21.52.127 at o2ib)
> Apr 25 10:31:20 drp-tst-ffb01 kernel: Lustre: Skipped 65 previous
> similar messages
>
>
> Errors client side:
>
> Apr 25 10:31:19 drp-tst-acc06 kernel: Lustre:
> drpffb-OST0002-osc-ffff880167fda800: Connection to drpffb-OST0002 (at
> 172.21.52.84 at o2ib) was lost; in progress operations using this service
> will wait for recovery to complete
> Apr 25 10:31:19 drp-tst-acc06 kernel: Lustre: Skipped 1 previous similar
> message
> Apr 25 10:31:19 drp-tst-acc06 kernel: LustreError: 167-0:
> drpffb-OST0002-osc-ffff880167fda800: This client was evicted by
> drpffb-OST0002; in progress operations using this service will fail.
> Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError: 11-0:
> drpffb-OST0001-osc-ffff880167fda800: operation ost_statfs to node
> 172.21.52.83 at o2ib failed: rc = -107
> Apr 25 10:31:22 drp-tst-acc06 kernel: Lustre:
> drpffb-OST0001-osc-ffff880167fda800: Connection to drpffb-OST0001 (at
> 172.21.52.83 at o2ib) was lost; in progress operations using this service
> will wait for recovery to complete
> Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError: 167-0:
> drpffb-OST0001-osc-ffff880167fda800: This client was evicted by
> drpffb-OST0001; in progress operations using this service will fail.
> Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
> 59702:0:(ldlm_resource.c:1100:ldlm_resource_complain())
> drpffb-OST0001-osc-ffff880167fda800: namespace resource
> [0x4ad:0x0:0x0].0x0 (ffff881004af6e40) refcount nonzero (1) after lock
> cleanup; forcing cleanup.
> Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
> 59702:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
> [0x4ad:0x0:0x0].0x0 (ffff881004af6e40) refcount = 2
> Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
> 59702:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
> [0x4ad:0x0:0x0].0x0 (ffff881004af6e40) refcount = 2
>
>
> some other info that can be useful:
>
> # lctl get_param  llite.*.max_cached_mb
> llite.drpffb-ffff880167fda800.max_cached_mb=
> users: 5
> max_cached_mb: 64189
> used_mb: 9592
> unused_mb: 54597
> reclaim_count: 0
> llite.drplu-ffff881fe1f99000.max_cached_mb=
> users: 8
> max_cached_mb: 64189
> used_mb: 0
> unused_mb: 64189
> reclaim_count: 0
>
> # lctl get_param ldlm.namespaces.*.lru_size
> ldlm.namespaces.MGC172.21.42.159 at tcp.lru_size=1600
> ldlm.namespaces.MGC172.21.42.213 at tcp.lru_size=1600
> ldlm.namespaces.drpffb-MDT0000-mdc-ffff880167fda800.lru_size=3
> ldlm.namespaces.drpffb-OST0001-osc-ffff880167fda800.lru_size=0
> ldlm.namespaces.drpffb-OST0002-osc-ffff880167fda800.lru_size=2
> ldlm.namespaces.drpffb-OST0003-osc-ffff880167fda800.lru_size=0
> ldlm.namespaces.drplu-MDT0000-mdc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0001-osc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0002-osc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0003-osc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0004-osc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0005-osc-ffff881fe1f99000.lru_size=0
> ldlm.namespaces.drplu-OST0006-osc-ffff881fe1f99000.lru_size=0
>
>
> Rick




