[Lustre-discuss] Lustre filesystem hangs when reading large files

Chris Exton cexton at ocf.co.uk
Wed Apr 20 02:08:33 PDT 2011


Hello,

We are currently running Lustre 1.8.1.1 on kernel version 2.6.18_128.7.1.el5_lustre.

We are experiencing problems when reading large files from our Lustre filesystem; small reads are not affected.

The read process hangs and the following messages are reported in /var/log/messages:

Feb 22 15:59:38 leopard kernel: LustreError: 11-0: an error occurred while communicating with 192.168.13.200@o2ib. The obd_ping operation failed with -107
Feb 22 15:59:38 leopard kernel: Lustre: lustre-OST0000-osc-ffff81067e0eac00: Connection to service lustre-OST0000 via nid 192.168.13.200@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Feb 22 15:59:38 leopard kernel: LustreError: 6811:0:(import.c:939:ptlrpc_connect_interpret()) lustre-OST0000_UUID went back in time (transno 476754140074 was previously committed, server now claims 0)!  See https://bugzilla.lustre.org/show_bug.cgi?id=9646
Feb 22 15:59:38 leopard kernel: LustreError: 167-0: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
Feb 22 15:59:38 leopard kernel: Lustre: lustre-OST0000-osc-ffff81067e0eac00: Connection restored to service lustre-OST0000 using nid 192.168.13.200@o2ib.
Feb 22 15:59:38 leopard kernel: LustreError: 17592:0:(lov_request.c:196:lov_update_enqueue_set()) enqueue objid 0x18f87222 subobj 0x4d0c9f on OST idx 0: rc -5
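
For reference when reading the excerpt above: the return codes look like negated Linux errno values, in which case -107 would be ENOTCONN and -5 would be EIO. The following is only a minimal sketch under that assumption. The lookup table and the helper name decode_return_codes are hypothetical and written by hand for illustration; they are not part of Lustre or its tools.

```python
import re

# Hand-written table of the two Linux errno values seen in the excerpt.
# Assumption: the logged codes are negated Linux errno numbers.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Matches suffixes such as "failed with -107" and "rc -5".
RC_PATTERN = re.compile(r"(?:failed with|\brc)\s+(-\d+)")


def decode_return_codes(line):
    """Return (code, name, description) for each return code found in a log line."""
    results = []
    for match in RC_PATTERN.finditer(line):
        code = int(match.group(1))
        # Codes missing from the small table are reported as UNKNOWN.
        name, text = LINUX_ERRNO.get(-code, ("UNKNOWN", "not in table"))
        results.append((code, name, text))
    return results
```

Passing each line of /var/log/messages through decode_return_codes would list the codes it contains; lines without a return code yield an empty list.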

I have checked the Bugzilla report referenced in the log (bug 9646), but we have not had a disk crash and the system was not restarted. Could this be an underlying hardware problem that is not being logged?

Any additional help on this matter would be much appreciated.

Kind Regards

Chris



