[lustre-discuss] Lustre 2.12.1 Network Problem
Petrillo, Neale A (Contractor)
Neale.Petrillo at unnpp.gov
Wed May 15 09:57:22 PDT 2019
Hello List!
We're building a new Lustre instance using Lustre 2.12.1 on CentOS 7.6 and are running into problems when trying to mount OSTs.
When trying to attach a new OST we get these messages on the MDS:
LustreError:(events.c:305:request_in_callback()) event type 2, status -103, service mgs
LustreError:(pack_generic.c:590:__lustre_unpack_msg()) message length 0 too small for magic/version check
LustreError:(pack_generic.c:590:__lustre_unpack_msg()) Skipped 1 previous similar message
LustreError:(sec.c:2191:sptlrpc_svc_unwrap_request()) error unpacking request from 12345=200.1.20.205 at o2ib x1633617112465632
LustreError:(sec.c:2191:sptlrpc_svc_unwrap_request()) Skipped 1 previous similar message
LustreError:(o2iblnd_cb.c:3325:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
LustreError:(o2iblnd_cb.c:3400:kiblnd_check_conns()) Timed out RDMA with 200.1.20.205 at o2ib (6): c: 8, oc: 0, rc: 8
LustreError:(events.c:305:request_in_callback()) event type 2, status -103, service mgs
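For reference, the status value in the request_in_callback() lines looks like a negated Linux errno. Here is a minimal sketch for looking it up, assuming the MDS is a Linux host; the helper name decode_status is hypothetical and is not part of Lustre. On Linux, errno 103 should come back as ECONNABORTED.

```python
import errno
import os
import re

# One of the messages above, copied verbatim from the MDS log.
LOG_LINE = (
    "LustreError:(events.c:305:request_in_callback()) "
    "event type 2, status -103, service mgs"
)


def decode_status(line):
    """Extract 'status <n>' from a log line and look up the errno name.

    Hypothetical helper for illustration only. The name and description
    depend on the errno table of the platform running this script.
    """
    match = re.search(r"status (-?\d+)", line)
    if match is None:
        return None
    status = int(match.group(1))
    code = abs(status)  # kernel status codes are negated errno values
    name = errno.errorcode.get(code, "unknown")
    return status, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_status(LOG_LINE))
```

This only translates the number; it does not explain why the connection was aborted.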
The confusing thing is that several OSTs had already been mounted successfully, and we can find no configuration differences between the OSTs that mount and the OSTs that do not. The network is 100 Gb Ethernet using RoCE, and lnetctl ping completes successfully on all the servers.
Does anybody have any thoughts on what might be causing these errors and any way to address them?
Thanks!