[Lustre-discuss] Nodes claim error with files, then say everything is fine.

Brian J. Murrell Brian.Murrell at Sun.COM
Wed Aug 6 08:15:25 PDT 2008


On Wed, 2008-08-06 at 08:24 -0600, Chris Worley wrote:
> On Wed, Aug 6, 2008 at 8:19 AM, Brian J. Murrell <Brian.Murrell at sun.com> wrote:
> >
> > Are there any Lustre messages in your syslog on the clients which are
> > returning unexpected results?
> 
> Indeed.  These clients seem to be timing out and getting evicted:
> 
> LustreError: 11-0: an error occurred while communicating with
> 36.102.29.1@o2ib. The ldlm_enqueue operation failed with -107
> LustreError: Skipped 36 previous similar messages
> Lustre: lfs-MDT0000-mdc-ffff81027cca8d20: Connection to service
> lfs-MDT0000 via nid 36.102.29.1@o2ib was lost; in progress operations
> using this service will wait for recovery to complete.
> Lustre: Skipped 36 previous similar messages
> LustreError: 167-0: This client was evicted by lfs-MDT0000; in
> progress operations using this service will fail.

So, now what does the MDS serving lfs-MDT0000 say about this?  Why did
it evict?  What version of Lustre is this?  Perhaps you said so already
and I have just forgotten.
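
For reference, the -107 in the client log above is a negated Linux errno
value; on Linux, 107 is ENOTCONN ("Transport endpoint is not connected"),
which fits a client that has lost its connection to the MDS. Below is a
rough sketch of pulling the operation name and error code out of such a
line. The regular expression and the decode_failure() helper are my own
illustration, not part of any Lustre tool, and the errno lookup assumes
the script runs on a Linux host:

```python
import errno
import os
import re

# Hypothetical pattern for the "operation failed with -N" wording
# seen in the LustreError 11-0 message quoted above.
FAILED_RE = re.compile(r"The (\w+) operation failed with (-?\d+)")

def decode_failure(line):
    """Return (operation, code, errno_name, description) or None."""
    m = FAILED_RE.search(line)
    if m is None:
        return None
    op = m.group(1)
    code = abs(int(m.group(2)))  # the log prints errno values negated
    # Symbolic name and text come from the local platform's errno table.
    name = errno.errorcode.get(code, "UNKNOWN")
    return op, code, name, os.strerror(code)

if __name__ == "__main__":
    print(decode_failure("The ldlm_enqueue operation failed with -107"))
```

Running the same function over the MDS syslog may help line up the
eviction there with the failures the clients report.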

b.
