[Lustre-discuss] LustreError: 11-0: an error occurred while communicating with 192.168.16.24 at o2ib. The ost_connect operation failed with -19

Dennis Nelson dnelson at sgi.com
Tue Mar 24 16:30:39 PDT 2009


Hi,

I have encountered an issue with Lustre that has happened a couple of times
now.  I am beginning to suspect an issue with the IB fabric but wanted to
reach out to the list to confirm my suspicions.  The odd part is that even
when the MDS complains that it cannot connect to a given OST, lctl ping to
the OSS that owns the OST works without an issue.  Also, the OSS in question
has other OSTs which, in the latest case, have not reported any errors.

I have attached a file with the errors that I encountered from the MDS.  I
am running Lustre 1.6.6 with a pair of MDSs, 8 OSSs, and 28 OSTs spread
across the 8 OSSs.  I am using IB DDR interconnects between all systems.
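
For reference, the -19 in the subject line looks like a negated errno value.
Below is a minimal sketch for decoding it, assuming a Linux host and that
Lustre is reporting a standard kernel errno here.  The helper name decode_rc
is made up for illustration and is not part of any Lustre tool.

```python
import errno
import os


def decode_rc(rc):
    """Map a negated errno (e.g. -19) to its symbolic name and description.

    Assumption: rc is a standard errno value with the sign flipped, as
    kernel code conventionally returns it.
    """
    code = abs(rc)
    # errno.errorcode maps numeric codes to names such as 'ENODEV'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable text for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = decode_rc(-19)
    print(name, "-", text)
```

On Linux, errno 19 is ENODEV.  If that reading is right, the failure is being
reported about the target rather than the network path, which may be why
lctl ping to the OSS still succeeds.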

Thanks,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: errors
Type: application/octet-stream
Size: 33745 bytes
Desc: errors
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20090324/c25b0e15/attachment.obj>

