[Lustre-discuss] lock callback timer expired: evicting client

Brian J. Murrell Brian.Murrell at Sun.COM
Fri Dec 14 08:57:41 PST 2007


On Fri, 2007-12-14 at 17:41 +0100, Per Lundqvist wrote:

> /var/log/messages on evicted client:
>    Dec 14 16:31:26 n75 kernel: LustreError: 11-0: an error occurred while communicating with 192.168.11.226 at tcp. The ldlm_enqueue operation failed with -107
>    Dec 14 16:31:26 n75 kernel: Lustre: MDC_mds2_misu2_mds_MNT_misu2_client-000001007e011400: Connection to service misu2_mds via nid 192.168.11.226 at tcp was lost; in progress operations using this service will wait for recovery to complete.
>    Dec 14 16:31:26 n75 kernel: LustreError: 167-0: This client was evicted by misu2_mds; in progress operations using this service will fail.
>    Dec 14 16:31:26 n75 kernel: LustreError: 5484:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -5
>    Dec 14 16:31:26 n75 kernel: LustreError: 5484:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req at 000001007dc87400 x854/t0 o101->misu2_mds_UUID at mds2_UUID:12 lens 392/728 ref 1 fl Rpc:/0/0 rc 0/0
>    Dec 14 16:31:26 n75 kernel: Lustre: MDC_mds2_misu2_mds_MNT_misu2_client-000001007e011400: Connection restored to service misu2_mds using nid 192.168.11.226 at tcp.
>    Dec 14 16:33:06 n75 kernel: LustreError: 11-0: an error occurred while communicating with 192.168.11.226 at tcp. The ldlm_enqueue operation failed with -107
>    Dec 14 16:33:06 n75 kernel: Lustre: MDC_mds2_misu2_mds_MNT_misu2_client-000001007e011400: Connection to service misu2_mds via nid 192.168.11.226 at tcp was lost; in progress operations using this service will wait for recovery to complete.
>    Dec 14 16:33:06 n75 kernel: LustreError: 167-0: This client was evicted by misu2_mds; in progress operations using this service will fail.
>    Dec 14 16:33:06 n75 kernel: LustreError: 5484:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -5
>    Dec 14 16:33:06 n75 kernel: LustreError: 5484:0:(mdc_locks.c:423:mdc_finish_enqueue()) Skipped 1 previous similar message
>    Dec 14 16:33:06 n75 kernel: Lustre: MDC_mds2_misu2_mds_MNT_misu2_client-000001007e011400: Connection restored to service misu2_mds using nid 192.168.11.226 at tcp.
>    ....<snip>..
>    
> /var/log/messages on MDS:
>    Dec 14 16:31:26 mds2 kernel: LustreError: 0:0:(ldlm_lockd.c:205:waiting_locks_callback()) ### lock callback timer expired: evicting client e689a9c4-2a46-9239-7fd7-5a7e2c8c6542 at NET_0x20000c0a80b4b_UUID nid 192.168.11.75 at tcp  ns: mds-misu2_mds_UUID lock: 0000010075f50700/0x959917d3f3b88931 lrc: 1/0,0 mode: CR/CR res: 4780751/2726154257 bits 0x3 rrc: 7 type: IBT flags: 30 remote: 0x2a6036f763199bff expref: 7 pid 5373
>    Dec 14 16:31:26 mds2 kernel: LustreError: 0:0:(ldlm_lockd.c:205:waiting_locks_callback()) Skipped 1 previous similar message
>    Dec 14 16:31:26 mds2 kernel: Lustre: 5376:0:(mds_reint.c:125:mds_finish_transno()) commit transaction for disconnected client e689a9c4-2a46-9239-7fd7-5a7e2c8c6542: rc 0
>    Dec 14 16:31:26 mds2 kernel: LustreError: 5394:0:(handler.c:1478:mds_handle()) operation 101 on unconnected MDS from 12345-192.168.11.75 at tcp
>    Dec 14 16:31:26 mds2 kernel: LustreError: 5394:0:(handler.c:1478:mds_handle()) Skipped 1 previous similar message
>    Dec 14 16:31:26 mds2 kernel: LustreError: 5394:0:(ldlm_lib.c:1343:target_send_reply_msg()) @@@ processing error (-107)  req at 000001012551c450 x852/t0 o101-><?>@<?>:-1 lens 392/0 ref 0 fl Interpret:/0/0 rc -107/0
>    Dec 14 16:31:26 mds2 kernel: LustreError: 5394:0:(ldlm_lib.c:1343:target_send_reply_msg()) Skipped 2 previous similar messages
>    Dec 14 16:33:06 mds2 kernel: LustreError: 0:0:(ldlm_lockd.c:205:waiting_locks_callback()) ### lock callback timer expired: evicting client e689a9c4-2a46-9239-7fd7-5a7e2c8c6542 at NET_0x20000c0a80b4b_UUID nid 192.168.11.75 at tcp  ns: mds-misu2_mds_UUID lock: 00000100b829fb00/0x959917d3f3b89ff1 lrc: 1/0,0 mode: CR/CR res: 4780751/2726154257 bits 0x3 rrc: 8 type: IBT flags: 30 remote: 0x2a6036f763199c3e expref: 6 pid 5398
>    Dec 14 16:33:06 mds2 kernel: Lustre: 5373:0:(mds_reint.c:125:mds_finish_transno()) commit transaction for disconnected client e689a9c4-2a46-9239-7fd7-5a7e2c8c6542: rc 0
>    Dec 14 16:33:06 mds2 kernel: LustreError: 5372:0:(handler.c:1478:mds_handle()) operation 101 on unconnected MDS from 12345-192.168.11.75 at tcp
>    Dec 14 16:33:06 mds2 kernel: LustreError: 5372:0:(ldlm_lib.c:1343:target_send_reply_msg()) @@@ processing error (-107)  req at 0000010027297800 x927/t0 o101-><?>@<?>:-1 lens 392/0 ref 0 fl Interpret:/0/0 rc -107/0
>    ....<snip>...

This looks like bug 13917; see attachment 13540.

b.
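
To see how often a client is being evicted this way, the MDS log can be
scanned for the "lock callback timer expired" message. What follows is only a
minimal sketch and not part of Lustre: the function name find_evictions and
the regular expression are assumptions based solely on the message format
quoted above, and other Lustre versions may word the message differently. The
pattern accepts both "@" and the " at " form that appears in this archive.

```python
import re

# Pattern assumed from the MDS log lines quoted above (hypothetical helper,
# not a Lustre tool). The NID may appear as "addr@net" or "addr at net".
EVICT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:.*"
    r"lock callback timer expired: evicting client (?P<uuid>[0-9a-f-]+)"
    r".*?\bnid (?P<nid>\S+(?: at |@)\S+)"
)


def find_evictions(lines):
    """Return (timestamp, server, client_uuid, client_nid) per eviction line."""
    events = []
    for line in lines:
        m = EVICT_RE.search(line)
        if m:
            events.append((m["stamp"], m["host"], m["uuid"], m["nid"]))
    return events
```

Feeding it the lines of /var/log/messages from the MDS, for example
find_evictions(open("/var/log/messages")), gives one tuple per eviction, which
makes it easy to check whether the same client NID keeps recurring.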




