[lustre-discuss] Lustre 2.12.1 Network Problem

Petrillo, Neale A (Contractor) Neale.Petrillo at unnpp.gov
Wed May 15 09:57:22 PDT 2019


Hello List!


We're setting up a new Lustre file system using Lustre 2.12.1 on CentOS 7.6 and are running into problems when trying to mount OSTs.


When we try to mount a new OST, we get these messages on the MDS:


LustreError:(events.c:305:request_in_callback()) event type 2, status -103, service mgs

LustreError:(pack_generic.c:590:__lustre_unpack_msg()) message length 0 too small for magic/version check

LustreError:(pack_generic.c:590:__lustre_unpack_msg()) Skipped 1 previous similar message

LustreError:(sec.c:2191:sptlrpc_svc_unwrap_request()) error unpacking request from 12345=200.1.20.205@o2ib x1633617112465632

LustreError:(sec.c:2191:sptlrpc_svc_unwrap_request()) Skipped 1 previous similar message

LustreError:(o2iblnd_cb.c:3325:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds

LustreError:(o2iblnd_cb.c:3400:kiblnd_check_conns()) Timed out RDMA with 200.1.20.205@o2ib (6): c: 8, oc: 0, rc: 8

LustreError:(events.c:305:request_in_callback()) event type 2, status -103, service mgs
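
As a reading aid for the log above: the `status -103` value looks like a negated Linux errno, which would make it ECONNABORTED. The following is a minimal sketch in Python, assuming that interpretation holds for these messages. The `decode_status` helper and the small errno table are hypothetical illustrations (values taken from the Linux generic errno numbering), not part of Lustre or its tooling.

```python
import re

# Assumption: the "status" field is a negated Linux errno value.
# Only a handful of network-related codes are listed here; the numbers
# follow the generic Linux errno numbering.
LINUX_ERRNO = {
    100: "ENETDOWN",
    101: "ENETUNREACH",
    102: "ENETRESET",
    103: "ECONNABORTED",
    104: "ECONNRESET",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    113: "EHOSTUNREACH",
}

STATUS_RE = re.compile(r"status (-?\d+)")


def decode_status(line):
    """Return the errno name for the 'status N' field in a log line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = abs(int(match.group(1)))
    return LINUX_ERRNO.get(code, f"unknown errno {code}")


line = ("LustreError:(events.c:305:request_in_callback()) "
        "event type 2, status -103, service mgs")
print(decode_status(line))
```

Under that assumption, the MGS request buffer was dropped because the connection was aborted, which would be consistent with the RDMA timeout reported a few lines later.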



The confusing thing is that several OSTs have already mounted successfully, and we can find no configuration differences between the OSTs that mount and those that do not. The network is 100 Gb Ethernet using RoCE, and lnetctl ping completes successfully on all the servers.


Does anybody have any thoughts on what might be causing these errors, or how to address them?


Thanks!