[Lustre-discuss] Lustre MDS Errors 1-7 and operation 101
adilger at sun.com
Thu Jan 15 07:56:28 PST 2009
On Jan 14, 2009 11:34 +0100, Thomas Roth wrote:
> Jan 14 10:44:33 server1 kernel: LustreError:
> 5118:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error
> (-107) req@ffff8107fd6c4c50 x2077599/t0 o101-><?>@<?>:0/0 lens 232/0 e
> 0 to 0 dl 1231927273 ref 1 fl Interpret:/0/0 rc -107/0
> Jan 14 10:46:42 server1 kernel: LustreError:
> 6766:0:(mgs_handler.c:557:mgs_handle()) lustre_mgs: operation 101 on
> unconnected MGS
>
> Error (-107) is /* Transport endpoint is not connected */. I have
> seen this before on clients that had lost the connection to the
> cluster. But this is on the MGS/MDS: one server with one partition for
> the MGS and one for the MDT.
> The second error suggests of course that the MGS is actually not
> connected - but how can a Lustre system run when its MGS isn't there?
> Makes no sense, does it?
It means some client is trying to perform operations on the MGS before
it is connected.
> O.k., the cluster is running Debian Etch 64bit, Kernel 2.6.22, Lustre
> 188.8.131.52. The "operation 101" thing is supposed to have been solved in
> the 1.6.4 -> 1.6.5 upgrade, according to the change logs.
There are a million things that might cause "operation 101" problems.
101 = LDLM_ENQUEUE, so this is just a lock enqueue.
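
For anyone reading these lines by hand, here is a rough sketch of pulling the opcode ("o101") and return code ("rc -107") out of a request dump like the one quoted above. It is only an illustration: the regular expression is an assumption based on that single sample line, the names decode(), OPCODES and ERRNOS are hypothetical, and the two lookup tables hold only the values named in this thread, not Lustre's full opcode or errno lists.

```python
import re

# Only the two codes mentioned in this thread; not a complete table.
OPCODES = {101: "LDLM_ENQUEUE"}
ERRNOS = {107: "Transport endpoint is not connected"}

# Assumed layout, inferred from the one sample line: "... o<opcode>-> ... rc <rc>/<n>"
REQ_RE = re.compile(r"\bo(?P<opc>\d+)->.*?\brc (?P<rc>-?\d+)/", re.S)

SAMPLE = (
    "(-107) req@ffff8107fd6c4c50 x2077599/t0 o101-><?>@<?>:0/0 lens 232/0 e "
    "0 to 0 dl 1231927273 ref 1 fl Interpret:/0/0 rc -107/0"
)


def decode(line):
    """Return opcode and rc (with names where known), or None if no match."""
    m = REQ_RE.search(line)
    if m is None:
        return None
    opc = int(m.group("opc"))
    rc = int(m.group("rc"))
    return {
        "opcode": opc,
        "opcode_name": OPCODES.get(opc, "unknown"),
        "rc": rc,
        # Kernel return codes are negated errno values.
        "rc_text": ERRNOS.get(-rc, "unknown") if rc < 0 else "ok",
    }


if __name__ == "__main__":
    print(decode(SAMPLE))
```

Run against the sample, this reports opcode 101 (LDLM_ENQUEUE) and rc -107 (transport endpoint not connected), which is the same reading as above: a lock enqueue arriving on a connection that is not established.
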
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.