[Lustre-discuss] Lustre clients under Xen

Lukas Hejtmanek xhejtman at ics.muni.cz
Tue Nov 30 11:14:55 PST 2010


Hello,

I see some oddities on Lustre clients running under Xen DomU. I get messages
like this:
Jul 29 14:35:23 quark8-1 kernel: Lustre: Request x2674628 sent from
stable-OST0001-osc-ffff8801a72a5000 to NID 147.251.9.9@tcp 100s ago has
timed out (limit 100s).
Jul 29 14:35:23 quark8-1 kernel: Lustre:
stable-OST0001-osc-ffff8801a72a5000: Connection to service
stable-OST0001 via nid 147.251.9.9@tcp was lost; in progress operations
using this service will wait for recovery to complete.
Jul 29 14:35:23 quark8-1 kernel: LustreError:
128:0:(ldlm_request.c:1033:ldlm_cli_cancel_req()) Got rc -11 from cancel
RPC: canceling anyway
Jul 29 14:35:23 quark8-1 kernel: LustreError:
128:0:(ldlm_request.c:1622:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Jul 29 14:35:23 quark8-1 kernel: Lustre:
stable-OST0001-osc-ffff8801a72a5000: Connection restored to service
stable-OST0001 using nid 147.251.9.9@tcp.
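
(Side note: the 100s limit above matches, I believe, the default obd_timeout. I
assume it can be inspected and raised on the client with something like

  lctl get_param timeout
  lctl set_param timeout=300

where 300 is only an arbitrary example value, but I do not know whether bumping
the timeout is a sensible workaround when a DomU can lose its CPU for long
stretches, or whether it just hides the problem.)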

The network is fine the whole time. I tried both 1.6.x and 1.8.x Lustre, with
the same result. Moreover, from time to time, the Lustre filesystem gets stuck in:
[<ffffffff882c5ef3>] :mdc:mdc_close+0x1e3/0x7a0
[<ffffffff88332f53>] :lustre:ll_close_inode_openhandle+0x1e3/0x650
[<ffffffff88333a05>] :lustre:ll_mdc_real_close+0x115/0x370
[<ffffffff883691e1>] :lustre:ll_mdc_blocking_ast+0x1d1/0x570
[<ffffffff88186720>] :ptlrpc:ldlm_cancel_callback+0x50/0xd0
[<ffffffff881a0721>] :ptlrpc:ldlm_cli_cancel_local+0x61/0x350
[<ffffffff881a2025>] :ptlrpc:ldlm_cancel_lru_local+0x165/0x340
[<ffffffff881a14c7>] :ptlrpc:ldlm_cli_cancel_list+0xf7/0x380
[<ffffffff881a2263>] :ptlrpc:ldlm_cancel_lru+0x63/0x1b0
[<ffffffff881b62d7>] :ptlrpc:ldlm_cli_pool_shrink+0xf7/0x240
[<ffffffff881b365d>] :ptlrpc:ldlm_pool_shrink+0x2d/0xe0
[<ffffffff881b48fb>] :ptlrpc:ldlm_pools_shrink+0x25b/0x330
[<ffffffff8025c705>] shrink_slab+0xe2/0x15a

when the DomU is being suspended (most of its memory and CPU are stolen by
another DomU).
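
If I read the trace correctly, the LDLM pool shrinker is being called from
shrink_slab, i.e. the kernel is under memory pressure and tries to cancel
cached locks while the OST connection has already timed out. Would capping the
client lock LRU make any difference here, for example something like

  lctl set_param ldlm.namespaces.*osc*.lru_size=1200

(1200 is an arbitrary value, just for illustration), or would that only paper
over the underlying preemption problem?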

Is this a known issue, or is running Lustre clients under Xen with domain
preemption simply unsupported?

-- 
Lukáš Hejtmánek


