[lustre-discuss] o2ib (ib_qib) with 2.7.0 rpms on centos 6.6: LNetError: kiblnd_init_rdma: Src buffer exhausted: 1 frags

Lassus, Magnus magnus.lassus at wartsila.com
Wed Nov 18 10:17:13 PST 2015


Hi,

I can't work out what I'm doing wrong in getting o2ib to work with the 2.7.0 rpms on top of CentOS 6.6. Running LNet selftest, I see:

Nov 17 18:22:40 ss08 kernel: LNet: Added LNI 10.165.32.18@o2ib [8/256/0/180]
Nov 17 18:24:40 ss08 kernel: LNetError: 12532:0:(o2iblnd_cb.c:1123:kiblnd_init_rdma()) Src buffer exhausted: 1 frags
Nov 17 18:24:40 ss08 kernel: LustreError: 12553:0:(brw_test.c:212:brw_check_page()) Bad data in page ffffea0070c20800: 0xbeefbeefbeefbeef, 0xeeb0eeb1eeb2eeb3 expec
Nov 17 18:24:40 ss08 kernel: LustreError: 12553:0:(brw_test.c:238:brw_check_bulk()) Bulk page ffffea0070c20800 (0/256) is corrupted!
Nov 17 18:24:40 ss08 kernel: LustreError: 12553:0:(brw_test.c:343:brw_client_done_rpc()) Bulk data from 12345-10.165.32.18@o2ib is corrupted!
Nov 17 18:24:40 ss08 kernel: LNetError: 12532:0:(o2iblnd_cb.c:1690:kiblnd_reply()) Can't setup rdma for GET from 10.165.32.18@o2ib: -71
Nov 17 18:25:31 ss08 kernel: LNetError: 12529:0:(o2iblnd_cb.c:3036:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
Nov 17 18:25:31 ss08 kernel: LNetError: 12529:0:(o2iblnd_cb.c:3099:kiblnd_check_conns()) Timed out RDMA with 10.165.32.18@o2ib (0): c: 7, oc: 0, rc: 7
Nov 17 18:25:31 ss08 kernel: LustreError: 12558:0:(brw_test.c:388:brw_bulk_ready()) BRW bulk WRITE failed for RPC from 12345-10.165.32.18@o2ib: -103
Nov 17 18:25:31 ss08 kernel: LustreError: 12558:0:(brw_test.c:362:brw_server_rpc_done()) Bulk transfer from 12345-10.165.32.18@o2ib has failed: -5
Nov 17 18:25:48 ss08 kernel: LNet: 12581:0:(rpc.c:1077:srpc_client_rpc_expired()) Client RPC expired: service 11, peer 12345-10.165.32.18@o2ib, timeout 64.
Nov 17 18:25:48 ss08 kernel: LustreError: 12555:0:(brw_test.c:318:brw_client_done_rpc()) BRW RPC to 12345-10.165.32.18@o2ib failed with -110
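
For reference, the trailing numbers in the messages above (-71, -103, -5, -110) look like negated Linux errno values, which is the usual kernel convention. The following is only a minimal sketch for decoding them, assuming that reading is correct; the table is hard-coded from the standard Linux errno numbering, and the helper names `decode_rc` and `trailing_rc` are made up for illustration and are not part of Lustre.

```python
import re

# Standard Linux errno numbers for the codes seen in the log above.
# Assumption: the kernel messages report negated errno values.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    71: ("EPROTO", "Protocol error"),
    103: ("ECONNABORTED", "Software caused connection abort"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def decode_rc(rc):
    """Return a readable string for a (negative) kernel return code."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in table"))
    return f"{rc}: {name} ({text})"


def trailing_rc(line):
    """Hypothetical helper: extract a negative return code at the end of a log line."""
    match = re.search(r"(-\d+)\s*$", line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for rc in (-71, -103, -5, -110):
        print(decode_rc(rc))
```

Read this way, the sequence would be a protocol error while setting up the RDMA for the GET, followed by an aborted connection, a generic I/O error on the bulk transfer, and finally a timeout on the BRW RPC.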

# rpm -qa | egrep 'lustre|kernel' | sort
dracut-kernel-004-356.el6.noarch
kernel-2.6.32-504.8.1.el6_lustre.x86_64
kernel-devel-2.6.32-504.8.1.el6_lustre.x86_64
kernel-firmware-2.6.32-504.8.1.el6_lustre.x86_64
kernel-headers-2.6.32-504.8.1.el6_lustre.x86_64
lustre-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
lustre-iokit-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
lustre-modules-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
lustre-osd-ldiskfs-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
lustre-osd-ldiskfs-mount-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
lustre-tests-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64
perf-2.6.32-504.8.1.el6_lustre.x86_64
python-perf-2.6.32-504.8.1.el6_lustre.x86_64

Using the latest 2.7.63 build on CentOS 6.7 works.

Any pointers are warmly welcome as I'd prefer to use 2.7.0.

Regards,
Magnus

