[lustre-discuss] Added OSTs, now lnet errors

Brett Lee brettlee.lustre at gmail.com
Sun Dec 11 14:46:04 PST 2016


Steve, it might be the network that LNet is running on. Have you run any
bandwidth tests outside of LNet to check for problems in the network itself?
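
If a dedicated tool such as iperf is not at hand, a raw TCP test along the following lines is one way to exercise the same network path without LNet involved. This is only a rough sketch: the function names, the port number (5201) and the 1 MiB chunk size are arbitrary choices made for illustration, and a single Python stream may not saturate a fast link, so treat a low number as a hint to investigate rather than as a measurement.

```python
import socket
import threading
import time

CHUNK = 1 << 20  # 1 MiB per send; an arbitrary size chosen for this sketch


def serve_once(host="0.0.0.0", port=5201, ready=None):
    """Accept one TCP connection, drain it, and return the bytes received."""
    with socket.create_server((host, port)) as srv:
        if ready is not None:
            ready.set()  # signal that the listener is up
        conn, _ = srv.accept()
        total = 0
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:  # peer closed the connection
                    break
                total += len(data)
        return total


def send_for(host, port=5201, seconds=10.0):
    """Stream zero-filled buffers for about `seconds`; return (bytes, MB/s)."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        while time.monotonic() - start < seconds:
            sock.sendall(payload)
            sent += len(payload)
    elapsed = time.monotonic() - start
    return sent, sent / elapsed / 1e6
```

Run serve_once() on one of the new OSSes and send_for("<OSS address>") on an affected client, then swap roles to test the other direction. A rate well below what the link should deliver, or one that collapses under several parallel streams, would point at the network rather than at Lustre.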
On Dec 11, 2016 3:37 PM, "Steve Barnet" <barnet at icecube.wisc.edu> wrote:

> Hi all,
>
>   Seeing something very strange. I recently added two OSSes
> and 10 OSTs to one of our filesystems. Things look OK under
> light loads, but when we load them up, we start seeing lots
> of LNet errors.
>
> OS: Scientific Linux 6.7
> Lustre - Server: 2.8.0 Community version
> Lustre - Client: 2.5.3
>
> The errors are below. Do these narrow the range of possible
> problems?
>
>
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LNetError:
> 7732:0:(socklnd_cb.c:2509:ksocknal_check_peer_timeouts()) Total 4 stale
> ZC_REQs for peer 10.128.10.29@tcp1 detected; the oldest(ffff880f6a90e000)
> timed out 7 secs ago, resid: 0, wmem: 0
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 7732:0:(events.c:447:server_bulk_callback()) event type 5, status -5,
> desc ffff8805379f8000
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 7732:0:(events.c:447:server_bulk_callback()) event type 5, status -5,
> desc ffff880f375dc000
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 8234:0:(ldlm_lib.c:3175:target_bulk_io()) @@@ network error on bulk READ
> req@ffff880e506263c0 x1551187318090340/t0(0)
> o3->092e941d-272a-09e3-502b-9338dbf387d3@10.128.10.29@tcp1:587/0 lens
> 488/432 e 3 to 0 dl 1481476687 ref 1 fl Interpret:/0/0 rc 0/0
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 8234:0:(ldlm_lib.c:3175:target_bulk_io()) Skipped 1 previous similar
> message
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: Lustre: lfs2-OST0024: Bulk IO read
> error with 092e941d-272a-09e3-502b-9338dbf387d3 (at 10.128.10.29@tcp1),
> client will retry: rc -110
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 7732:0:(events.c:447:server_bulk_callback()) event type 5, status -5,
> desc ffff8804db0ce000
> Dec 11 11:17:39 lfs-ex-oss-20 kernel: LustreError:
> 7732:0:(events.c:447:server_bulk_callback()) event type 5, status -5,
> desc ffff880aa4374000
>
>
> Thanks much!
>
> Best,
>
> ---Steve