[lustre-discuss] RDMA too many fragments/timed out - clients slowing entire filesystem performance

Brian W. Johanson bjohanso at psc.edu
Tue Nov 1 17:18:10 PDT 2016


Great, thanks Doug!

Quotas are not enabled.

A few nodes were exhibiting the issue fairly consistently.  We have recently 
added 70 clients (~900 total), which seems to have caused this to happen more 
frequently.

-b


On 11/01/2016 07:57 PM, Oucharek, Doug S wrote:
> Hi Brian,
>
> You need this patch: http://review.whamcloud.com/#/c/12451.  It has not landed 
> to master yet and is off by default.  To activate it, add this module 
> parameter line to your nodes (all of them):
>
> options ko2iblnd wrq_sge=2
>
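(For anyone following along, a minimal sketch of how that setting might be made 
persistent and verified -- this assumes a build carrying the patch above, which 
is what exposes the wrq_sge parameter; the file name is just an example:)

    # /etc/modprobe.d/ko2iblnd.conf   (example file name)
    options ko2iblnd wrq_sge=2

    # The module has to be reloaded for the option to take effect, e.g. after
    # unmounting Lustre (or at the next reboot):
    lustre_rmmod
    modprobe lustre

    # If the patched module exports the parameter via sysfs, it can be checked:
    cat /sys/module/ko2iblnd/parameters/wrq_sge
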
> The issue is that something is introducing an offset into the bulk 
> transfers, which misaligns the source and destination fragments.  Due to how 
> the algorithm works, you then require twice as many descriptors as fragments 
> to do the RDMA operation, so you run out of descriptors when you are only 
> halfway done setting up the transfer.  The above patch creates two sets of 
> descriptors so the second set can be utilized in situations like this.  The 
> fix operates on the nodes doing the bulk transfers, and since bulk data is 
> both read and written, you need the fix on servers, clients, and LNet 
> routers (basically, everywhere).
>
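(As a rough sanity check against the client log below, the numbers do seem 
consistent with this explanation, if the "(32)" in the message is read as the 
per-connection fragment limit and each misaligned fragment ends up consuming 
two descriptors:

    descriptor limit for the connection     : 32
    descriptors per fragment with an offset : 2
    fragments that can be mapped            : 32 / 2 = 16
    fragments actually needed               : 27

which would fail exactly where the log reports "src idx/frags: 16/27".)
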
> Question: are you using the quotas feature and could it be at or approaching a 
> limit?  There has been some evidence that the quotas feature could be 
> introducing the offset to bulk transfers.
>
> Doug
>
>> On Nov 1, 2016, at 4:08 PM, Brian W. Johanson <bjohanso at psc.edu> wrote:
>>
>> Centos 7.2
>> Lustre 2.8.0
>> ZFS 0.6.5.5
>> OPA 10.2.0.0.158
>>
>>
>> The clients and servers are on the same OPA network, no routing.  Once a 
>> client gets into this state, the filesystem performance drops to a fraction 
>> of what it is capable of.  The client must be rebooted to clear the issue.
>>
>> I imagine I am missing a bug in Jira for this; does this look like a known 
>> issue?
>>
>>
>> Pertinent debug messages from the server:
>>
>> 00000800:00020000:34.0:1478026118.277782:0:29892:0:(o2iblnd_cb.c:3109:kiblnd_check_txs_locked()) 
>> Timed out tx: active_txs, 4 seconds
>> 00000800:00020000:34.0:1478026118.277785:0:29892:0:(o2iblnd_cb.c:3172:kiblnd_check_conns()) 
>> Timed out RDMA with 10.4.119.112 at o2ib (3): c: 112, oc: 0, rc: 66
>> 00000800:00000100:34.0:1478026118.277787:0:29892:0:(o2iblnd_cb.c:1913:kiblnd_close_conn_locked()) 
>> Closing conn to 10.4.119.112 at o2ib: error -110(waiting)
>> 00000100:00020000:34.0:1478026118.277844:0:29892:0:(events.c:447:server_bulk_callback()) 
>> event type 5, status -103, desc ffff883e8e8bcc00
>> 00000100:00020000:34.0:1478026118.288714:0:29892:0:(events.c:447:server_bulk_callback()) 
>> event type 3, status -103, desc ffff883e8e8bcc00
>> 00000100:00020000:34.0:1478026118.299574:0:29892:0:(events.c:447:server_bulk_callback()) 
>> event type 5, status -103, desc ffff8810e92e9c00
>> 00000100:00020000:34.0:1478026118.310434:0:29892:0:(events.c:447:server_bulk_callback()) 
>> event type 3, status -103, desc ffff8810e92e9c00
>>
>>
>> And from the client:
>>
>> 00000400:00000100:8.0:1477949860.565777:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949860.565782:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000800:00020000:8.0:1477949860.702666:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
>> RDMA has too many fragments for peer 10.4.108.81 at o2ib (32), src idx/frags: 
>> 16/27 dst idx/frags: 16/27
>> 00000800:00020000:8.0:1477949860.702667:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
>> Can't setup rdma for GET from 10.4.108.81 at o2ib: -90
>> 00000100:00020000:8.0:1477949860.702669:0:3629:0:(events.c:201:client_bulk_callback()) 
>> event type 1, status -5, desc ffff880fd5d9bc00
>> 00000800:00020000:8.0:1477949860.816666:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
>> RDMA has too many fragments for peer 10.4.108.81 at o2ib (32), src idx/frags: 
>> 16/27 dst idx/frags: 16/27
>> 00000800:00020000:8.0:1477949860.816668:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
>> Can't setup rdma for GET from 10.4.108.81 at o2ib: -90
>> 00000100:00020000:8.0:1477949860.816669:0:3629:0:(events.c:201:client_bulk_callback()) 
>> event type 1, status -5, desc ffff880fd5d9bc00
>> 00000400:00000100:8.0:1477949861.573660:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573664:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573667:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573669:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573671:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573673:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573675:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000400:00000100:8.0:1477949861.573677:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
>> Dropping PUT from 12345-10.4.108.81 at o2ib portal 4 match 1549728742532740 
>> offset 0 length 192: 4
>> 00000800:00020000:8.0:1477949861.721668:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
>> RDMA has too many fragments for peer 10.4.108.81 at o2ib (32), src idx/frags: 
>> 16/27 dst idx/frags: 16/27
>> 00000800:00020000:8.0:1477949861.721669:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
>> Can't setup rdma for GET from 10.4.108.81 at o2ib: -90
>> 00000100:00020000:8.0:1477949861.721670:0:3629:0:(events.c:201:client_bulk_callback()) 
>> event type 1, status -5, desc ffff880fd5d9bc00
>> 00000800:00020000:8.0:1477949861.836668:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
>> RDMA has too many fragments for peer 10.4.108.81 at o2ib (32), src idx/frags: 
>> 16/27 dst idx/frags: 16/27
>> 00000800:00020000:8.0:1477949861.836669:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
>> Can't setup rdma for GET from 10.4.108.81 at o2ib: -90
>> 00000100:00020000:8.0:1477949861.836670:0:3629:0:(events.c:201:client_bulk_callback()) 
>> event type 1, status -5, desc ffff880fd5d9bc00
>> 00000800:00020000:8.0:1477949862.061668:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
>> RDMA has too many fragments for peer 10.4.108.81 at o2ib (32), src idx/frags: 
>> 16/27 dst idx/frags: 16/27
>> 00000800:00020000:8.0:1477949862.061669:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
>> Can't setup rdma for GET from 10.4.108.81 at o2ib: -90
>> 00000100:00020000:8.0:1477949862.061670:0:3629:0:(events.c:201:client_bulk_callback()) 
>> event type 1, status -5, desc ffff880fd5d9bc00
>>
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
