[lustre-discuss] operation ldlm_queue failed with -11
paf at cray.com
Wed May 3 09:23:18 PDT 2017
That reasoning is sound, but this is a special case. -11 (-EAGAIN) on ldlm_enqueue is generally OK...
LU-8658 explains the situation (it's POSIX flocks), so I'm going to reference that rather than repeat it here.
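For readers who want to see the mechanism in userspace: this is a minimal sketch using plain flock(2), not Lustre's DLM, and the file name is made up. A contended non-blocking lock request comes back as errno 11 (EAGAIN/EWOULDBLOCK on Linux), which is the -11 that shows up in the log line.

```python
import fcntl
import os

# Hypothetical demo file. flock(2) locks attach to the open file
# description, so two separate opens of the same file conflict even
# within a single process.
path = "/tmp/flock_demo"
fd1 = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
fcntl.flock(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)      # first lock succeeds

fd2 = os.open(path, os.O_RDWR)
got = None
try:
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # contended, non-blocking
except BlockingIOError as e:
    got = e.errno                                    # EWOULDBLOCK == EAGAIN

print("non-blocking lock failed with errno", got)    # 11 on Linux
os.close(fd1)
os.close(fd2)
os.unlink(path)
```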
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu>
Sent: Wednesday, May 3, 2017 11:07:53 AM
To: Lydia Heck
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] operation ldlm_queue failed with -11
I think that -11 is EAGAIN, but I don’t know how to interpret what that means in the context of Lustre locking. I assume these messages are from the clients and the changing “xxxxx” portion is just the fact that each client has a different identifier. So if you have multiple clients complaining about errors to the same MDS server, then my first guess would be that there is something wrong on the server side of things.
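That reading of -11 can be checked against the standard errno table (a quick sanity check on Linux, not anything Lustre-specific; the kernel convention is to return -errno):

```python
import errno
import os

# On Linux, errno 11 is EAGAIN; the negative value in the Lustre log
# is just the kernel returning -errno.
assert errno.EAGAIN == 11
print(errno.errorcode[11], "->", os.strerror(11))
```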
Senior HPC System Administrator
National Institute for Computational Sciences
> On May 2, 2017, at 4:52 AM, Lydia Heck <lydia.heck at durham.ac.uk> wrote:
> Dear all,
> we get many entries in our logs of the type
> kernel: LustreError: 11-0: scratch-MDT0000-mdc-xxxxxxxxxxxxxx: Communicating with 172.17.xxx.yyy at o2ib, operation ldlm_enqueue failed with -11
> with the -xxxxxxxxxxxxxx portion changing,
> but always to the same MDS system.
> I have looked on the internet but failed to find this error; there is very little info on ldlm_enqueue messages.
> Best wishes,
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org